Python Developer with PySpark, England, United Kingdom
Client: N Consulting Ltd
Location: England, United Kingdom
Job Category: Other
EU work permit required: Yes
Job Reference: b0c51035b40a
Job Views: 7
Posted: 26.04.2025
Expiry Date: 10.06.2025
Job Description:
Job Title: Python Developer with PySpark
Job Type: Contract
About the Role:
We are seeking a skilled Python Developer with expertise in PySpark to join our dynamic team. The ideal candidate will have strong experience in building and optimizing large-scale data processing pipelines and a deep understanding of distributed data systems. You will play a key role in designing and implementing data solutions that drive critical business decisions.
Key Responsibilities:
* Develop, optimize, and maintain large-scale data pipelines using PySpark and Python.
* Collaborate with data engineers, analysts, and stakeholders to gather requirements and implement data solutions.
* Perform ETL (Extract, Transform, Load) processes on large datasets and ensure efficient data workflows (a minimal sketch follows this list).
* Analyze and debug data processing issues to ensure accuracy and reliability of pipelines.
* Work with distributed computing frameworks to handle large datasets efficiently.
* Develop reusable components, libraries, and frameworks for data processing.
* Optimize PySpark jobs for performance and scalability.
* Integrate data pipelines with cloud platforms like AWS, Azure, or Google Cloud (if applicable).
* Monitor and troubleshoot production data pipelines to minimize downtime and data issues.
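For illustration, a minimal sketch of the kind of PySpark ETL pipeline this role would build and maintain; the bucket paths, column names, and business rule are hypothetical placeholders, not details from this posting:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw CSV data from a (hypothetical) landing zone.
orders = spark.read.csv("s3://raw-bucket/orders/", header=True, inferSchema=True)

# Transform: drop invalid rows and derive a revenue column.
clean = (
    orders
    .filter(F.col("quantity") > 0)
    .withColumn("revenue", F.col("quantity") * F.col("unit_price"))
)

# Load: write the result as Parquet for downstream consumers.
clean.write.mode("overwrite").parquet("s3://curated-bucket/orders/")

spark.stop()
```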
Key Skills and Qualifications:
Technical Skills:
* Strong programming skills in Python with hands-on experience in PySpark.
* Experience with distributed data processing frameworks (e.g., Spark).
* Proficiency in SQL for querying and transforming data.
* Understanding of data partitioning, serialization formats (Parquet, ORC, Avro), and data compression techniques (see the sketch after this list).
* Familiarity with Big Data technologies such as Hadoop, Hive, and Kafka (optional but preferred).
* Hands-on experience with AWS services like S3, EMR, Glue, or Redshift.
* Knowledge of Azure Data Lake, Databricks, or Google BigQuery is a plus.
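For illustration, a minimal sketch combining Spark SQL with a partitioned, compressed columnar write, the kind of layout work described above; the table, column, and path names are hypothetical:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("partitioned_write").getOrCreate()

# Register a (hypothetical) Parquet dataset as a SQL view.
events = spark.read.parquet("s3://curated-bucket/events/")
events.createOrReplaceTempView("events")

# Aggregate with SQL, then write partitioned, compressed Parquet.
daily = spark.sql("""
    SELECT event_date, country, COUNT(*) AS event_count
    FROM events
    GROUP BY event_date, country
""")

(daily.write
    .mode("overwrite")
    .partitionBy("event_date")          # enables partition pruning on date filters
    .option("compression", "snappy")    # splittable, CPU-cheap compression
    .parquet("s3://analytics-bucket/daily_event_counts/"))
```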
Additional Tools and Frameworks:
* Familiarity with CI/CD tooling (e.g., Jenkins) and version control (Git).
* Experience with orchestration tools like Apache Airflow or Luigi (see the sketch after this list).
* Understanding of containerization and orchestration tools like Docker and Kubernetes (preferred).
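For illustration, a minimal Apache Airflow DAG (assuming Airflow 2.4+) that schedules a daily spark-submit run; the DAG id and script path are hypothetical:

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_pyspark_etl",        # hypothetical DAG name
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    run_etl = BashOperator(
        task_id="run_etl",
        # Hypothetical job script; --master depends on the cluster setup.
        bash_command="spark-submit --master yarn /opt/jobs/etl_pipeline.py",
    )
```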
Experience:
* Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
* 5+ years of experience in Python programming.
* 4+ years of hands-on experience with PySpark.
* Experience with Big Data ecosystems and tools.