For a freelance project, we are looking for a

Senior Data Engineer (f/m/d)

Development of the client's custom vulnerability and exposure management platform.

Responsibilities

  • Design, develop, and maintain backend components of our vulnerability and exposure management platform, primarily using Python.
  • Implement and optimize ETL (Extract, Transform, Load) processes using tools like Airflow, Spark, Dremio, and Iceberg.
  • Integrate with various data sources (APIs, databases, etc.) to collect asset and vulnerability data.
  • Develop and maintain APIs for data access and integration with other systems.
  • Ensure data quality, accuracy, and consistency throughout the platform.
  • Collaborate with frontend developers to deliver seamless user experiences.
  • Contribute to the platform’s architecture and design, ensuring scalability and performance.
  • Write unit and integration tests to maintain code quality.
  • Participate in code reviews and contribute to best practices.

Profile

  • 4+ years of professional experience in data engineering using Python and PySpark (Must-have)
  • Advanced modeling and query optimization in Neo4j (Cypher, indexes, APOC) (Must-have)
  • Deep expertise in tuning PySpark jobs (partitioning, caching, broadcast joins, cluster sizing)
  • Hands-on ownership of ELT pipelines orchestrated with Apache Airflow 2.x
  • Production experience exposing and accelerating queries with Dremio
  • Proficiency with relational and NoSQL databases supporting analytics workloads
  • Experience designing and implementing data-centric RESTful APIs
  • Fluency with Git workflows and automated CI/CD (GitHub Actions, GitLab CI)
  • Excellent problem-solving skills and the ability to deliver root cause analyses (RCA) under tight SLAs
  • Able to work autonomously and drive best practices across cross-functional teams
  • Fluency in English required


Bonus

  • Hands-on with AWS EMR or GCP Dataproc for scalable Spark clusters
  • Containerization and deployment with Docker and Kubernetes
  • Contributions to open-source projects (PySpark, Airflow, graph-DB tooling)
  • Active follower of advances in PySpark, Arrow, and graph database ecosystems


Completely remote for the Barcelona location

Apply online

Project published: 07.10.2025
Start date: ASAP
Period: 3 months
Postcode: DE 7XXXX
Sector: IT-Services
Capacity: 5 days per week, remote
CA-Number: CA-99711
Your contact person: Melina Linder, melina.linder@etengo.de