Data Engineer -
• You are a data evangelist, always ready to get your hands dirty by jumping right into engineering data solutions for the team.
• You are an expert in ingesting, extracting, moving, transforming, cleaning, and loading massive structured and unstructured data in a Hadoop environment, both in batch and in real time.
• You are a SQL, scripting language, and object-oriented language “guru” and are always on the lookout to optimize and automate pipelines.
• You have a solid understanding of the tools and solutions used to serve machine learning model outputs to millions of users.
• You are an expert in policies and processes related to data security and data privacy.
• You interface with leadership and team members to get clarity on your requirements, end goals and deliverables.
• You are detail oriented but never lose sight of deadlines.
• You always have an eye on the products you deployed, always strive for a better end-user experience, and never assume that everything is ok.
• You are comfortable following the procedures and coding standards set by the organization while also being able to define improved procedures and standards.
• You have an MS or BS in Computer Science.
• You have experience in sales or marketing functions for consumer products, retail, tech, telecom, or financial industries.
• You have 5-10 years of experience in a Big Data environment, specifically Hive/Spark, where you have deployed reliable data models that scale smoothly on high-volume (1TB+) and high-dimensionality (500+ variables per schema) data.
• You have worked with massive structured/unstructured data sets before and have a strong grasp of deploying and maintaining machine learning models.
Must have Soft Skills –
• Proactive and sometimes Reactive
o "Jump right in" attitude
o Need to be an independent contributor while valuing feedback from team members
• Work Style –
o Strong team player
o Detail oriented but never lose sight of deadlines
• Procedure Oriented
o Clearly understand and agree on the standards set by the leadership and how they will be evaluated
o Maintain standards and also recommend new procedures and systems
• Interpersonal skills –
o Need to be able to empathize with the team and manage his/her internal state to handle the workload when targets are ambitious.
o Given ambitious deadlines, be proactive and get clarity on requirements, deliverables, and tasks from the PM and leadership
• Quality Control –
o Constantly check on pipelines and do not assume that things are ok.
• Work with Data Scientists to design, architect, and deploy scalable and reliable data products
• Ensure the data product is deployed with thorough testing and is of the highest quality
• Ensure that data products are secure and private
• Interface with team members regularly to brainstorm ideas and communicate progress and risks.
• Follow Agile development principles
• Use JIRA and Confluence as tools for Agile Project Management
• Communicate progress in a clear, concise, and timely manner to other stakeholders
• Architect, design, and develop solutions to ingest, extract, transform, and load massive structured and unstructured data sets in a Hadoop environment
• Work with DS and PMs to clearly understand the data requirements
• Always be on the lookout to optimize the data pipelines in terms of execution time and resource utilization
• Keep a close eye on data quality
• Debug the data pipelines with ease
• Become knowledgeable in the in-house Data Science tools
• Always look for opportunities to automate data pipelines wherever applicable
• Follow the engineering standards set by the organization while advocating for optimal and efficient procedures
• Be comfortable performing production support, job scheduling/monitoring, and data freshness reporting activities.
Compliance - Responsibilities
• Ensure completeness of the solution
• Make sure that the data is of the highest quality
• Be adept at documenting the data modeling work. Ensure that the documentation is complete and reviewed.
• Ensure code meets team standards, and define new coding standards and practices
• Use JIRA for agile development
Should have Knowledge and Experience
• Experience in Telecom/retail/e-commerce/consumer packaged goods (CPG), Mobile or Consumer electronics industries
• Agile Principles/Scrum/Kanban
• Expert in Hadoop (Hive and Impala) & Spark (Spark/Scala)
• Proficient in a scripting language
• Proficient in an object-oriented language – Python/Scala preferred
• Expert in SQL development
• Proficient in using the Big Data stack (Hadoop, Hive, Spark, Kafka, Kerberos, Oozie, Impala, etc.)
• Adept at multi-threading and concurrency concepts.
• APIs and microservices.
BTI Recruiting Team
801 E Campbell Rd. Suite 230, Richardson, Texas 75081, USA