Job title: AWS Data Engineer – Remote
Job description:
ClearScale is a leading cloud systems integration company and AWS Premier Consulting Partner providing a wide range of cloud services, including cloud consulting, architecture design, migration, automation, application development, and managed services.
We help Fortune 500 enterprises, mid-sized businesses, and startups in verticals such as Healthcare, Education, Financial Services, Security, Media, and Technology succeed with ambitious, challenging, and unique cloud projects. We architect, develop, and launch innovative and sophisticated solutions using cutting-edge cloud technologies.
ClearScale is growing quickly and there is high demand for the services we provide. Clients come to us for our deep experience with Big Data, Containerization, Serverless Infrastructure, Microservices, IoT, Machine Learning, DevOps and more.
ClearScale is looking for an experienced Data Engineer to participate in a custom data pipeline development project.
- Migrate data from a multitude of data stores into the Data Lake.
- Orchestrate ETL processes for that data and slice it into the various data marts.
- Manage access to the data through Lake Formation
- Build a data delivery pipeline to ingest high-volume real-time streams, detect anomalies, compute windowed analytics, and store the results in Elasticsearch for dashboard consumption
- Analyze, scope and estimate tasks, identify technology stack and tools
- Design and implement optimal architecture and migration plan
- Develop new solution modules and re-architect existing ones; redesign and refactor program code
- Specify the infrastructure and assist DevOps engineers with provisioning
- Examine performance and advise on necessary infrastructure changes
- Communicate with the client on project-related issues
- Collaborate with in-house and external development and analytical teams
- Hands-on experience designing efficient architectures for high-load enterprise-scale applications or ‘big data’ pipelines
- Hands-on experience with AWS data toolsets including but not limited to DMS, Glue, DataBrew, EMR, SCT
- Practical experience implementing big data architectures and pipelines
- Hands-on experience with message queuing, stream processing and highly scalable ‘big data’ stores
- Advanced knowledge of and experience working with SQL and NoSQL databases
- Proven experience in redesigning and re-architecting large, complex business applications
- Strong self-management and self-organizational skills
Successful candidates should have experience with any of the following software/tools (not all required at the same time):
- Python and PySpark – strong knowledge, especially in developing Glue jobs
- Big data tools: Kafka, Spark, Hadoop (HDFS3, YARN2, Tez, Hive, HBase)
- Stream-processing systems: Kinesis Data Streams, Spark Streaming, Kafka Streams, Kinesis Data Analytics
- AWS cloud services: EMR, RDS, MSK, Redshift, DocumentDB, Lambda
- Message queue systems: ActiveMQ, RabbitMQ, AWS SQS
- Federated identity services (SSO): Okta, AWS Cognito
- We are looking for a candidate with 5+ years of experience in a Data, Cloud, or Software Engineer role who holds a degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field
- Experience using Apache Hudi with AWS data lakes
- 3+ years of graph database development and optimization
- Neo4j, SPARQL, Gremlin, TinkerPop, Pregel, Cypher, Graph Databases, Amazon Neptune, Knowledge Graphs
- Valid AWS certifications would be a great plus
What’s in it for you?
- Competitive Salary! Excellent Medical Benefits!!!
- Generous Vacation Benefit – Uncapped Paid Time Off
- Opportunity to build a leadership career with a leader in the fast-growing cloud industry
- Collaborative, high-energy culture
- 100% Remote Opportunity – distributed workforce – everyone works from home!
- Learning opportunities
Location: Dallas, TX
Job date: Tue, 22 Nov 2022 04:58:54 GMT