At BPTN (Black Professionals in Tech Network), we're pushing the future of tech forward by creating a space for Black professionals in tech to gather, grow, and evolve, all while serving as a conduit for companies to engage this talent across North America.
We're here to help Black professionals network, connect with one another, share resources, and grow their careers. Our rapidly growing network includes more than 50,000 Black professionals. We provide our members with access to mentorship, skill-building opportunities, and a strong peer network to support professional growth and advancement.
Our client is looking for a Data Engineer who will generate value by transforming raw data from a complex array of source systems into user-friendly data assets for the company's team of data scientists and insights analysts. You will develop, test, deploy, and maintain software to create complete and accurate data, using a variety of big data platforms, languages, and tools, including Hadoop, Spark, Scala, Git, and Maven. You will collaborate with a highly talented team of data engineers, data scientists, data architects, and data analysts to deliver business value to the organization, and you will develop recommendations for new ways to improve data quality throughout the data transformation lifecycle.
Responsibilities
- Develop, test, and deploy software to generate data assets for use by downstream insight analysts and data scientists
- Work with big data and cloud technologies such as Spark, Hadoop, Hive, S3, EMR, EC2, Lambda, and Kafka
- Work closely with stakeholders to ensure successful data asset design and development
- Join data across multiple data environments, such as HDFS and data warehouses, using complex queries
- Use Scala, Spark, GitHub, Maven, Jenkins and Airflow to develop and deploy automated data-producing software packages
- Create software artifacts and patterns for reuse within the enterprise
- Ensure ETL pipelines are produced with the highest quality standards and validated for completeness and accuracy
- Work on a cross-functional Agile team responsible for end-to-end delivery of business needs
- Help develop new solutions for batch and real-time data and analytics use cases
- Help improve data management processes, including acquiring, transforming, and storing massive volumes of structured and unstructured data
- Work closely with development teams to understand their needs and current processes and to promote best practices
Qualifications
- Bachelor's degree in Computer Science, Software Engineering, Mathematics, or another STEM field; Master's/PhD considered an asset
- Professional software development experience with Scala, Spark, Hadoop, Java, Linux, and SQL
- 3+ years of experience in the big data ecosystem, with Hadoop (Pig, Hive, HDFS), Apache Spark, and NoSQL/SQL databases
- Experience using Git and Maven while collaborating on a software development team
- Experience building big data ETL pipelines (e.g., Apache Airflow) and knowledge of CI workflows and build/test automation
Nice to Have
- Experience with other analytics programming languages (Python and R)
- Experience with other data analytics and visualization tools such as Tableau
- Experience with Agile software development
- Experience with DevOps concerns, including CI/CD
Location
- Toronto, ON