DevOps DataOps Architect

Join DataOps Architect TalentCloud

If you possess mastery in any of the roles or skills below, you can apply to this TalentCloud. Once you become an approved Experfy TalentCloud member, you will get exclusive access to jobs and project opportunities from our clients.

Cloud Description

Experts in this TalentCloud should be able to orchestrate the data platforms and frameworks to enable enterprises to rapidly adopt a modern data strategy and robustly manage unlimited amounts of data. Other responsibilities include:

  • TalentCloud members should be full-stack data architects with deep expertise in orchestrating and tying together different tools, frameworks, libraries, and vendor products with a configuration-driven approach to automate end-to-end data engineering, data preparation, and data analysis
  • Should be able to manage the data pipeline deployment, define the data-driven application process and create an automated/reproducible delivery pipeline that assists in building data platforms efficiently
  • The individual will play a key role in automating data ingestion, provisioning data storage and data processing, and creating a data delivery pipeline for end consumers
  • Experts will work closely with technical product managers, solution architects, data architects, data scientists, data QA engineers, and other internal stakeholders to address issues in the data pipeline and to make sure data is ingested accurately and on time
  • Work in a fast-paced and rewarding environment with bleeding-edge technologies and innovative concepts
  • Facilitate the data development process and operations
  • Creating suitable DataOps channels across the organization
  • Establishing continuous build environments to speed up data development
  • Designing efficient practices and delivering comprehensive best practices for DataOps engineering
  • Managing and reviewing technical operations
  • Working closely with the development, QA, and data teams to operationalize and automate end-to-end development and data management practices
  • Analyzing, executing, and streamlining DataOps practices, including scheduling data pipelines, managing workflows, and coordinating deployments with product teams
  • Automating processes with the right tools and guiding data developers and operations teams when issues arise
  • Monitoring, reviewing, and managing technical operations
  • Ability to manage teams with a leadership mindset
  • Ability to work in a dynamic, Agile-based development environment, providing operational support and bug fixes for existing applications as well as supporting enhancement releases and new development

Required Skills

  • Exceptional knowledge of the DataOps landscape:
    • Data orchestration tools: Airflow, Oozie, Reflow, DataKitchen
    • Automated testing and production data quality monitoring tools: Datadog, iCEDQ, Naveego, FirstEigen, Tricentis Tosca
    • Data versioning tools: DoltHub, Delphix, Pachyderm
    • Model management tools: Domino, Seldon, ParallelM, MLflow
    • Code and artifact storage (e.g. Git, Docker Hub)
    • Parametrization and secure key storage (e.g. Vault, Jinja2)
    • Distributed computing (e.g. Mesos, Kubernetes)
    • DataSecOps, Versioning, or Test Data Management
  • Strong knowledge in traditional release engineering/build engineering practices
  • Strong experience with tools such as Jenkins and CircleCI
  • Experience with infrastructure management and monitoring
  • Strong knowledge of DevOps platform tooling (OpenShift, Ansible, Chef, Puppet, Docker, and Kubernetes)
  • Strong knowledge of scripting languages and tools: bash, awk, sed, and Windows PowerShell
  • Ability to automate processes with the appropriate tools
  • Evaluating, implementing, and streamlining DevOps practices
  • Establishing a continuous build environment to accelerate software deployment and development processes
  • Helping operations and development teams to solve their problems
  • Defining and supporting DataOps processes and operations
  • Experience with CI/CD tools for automated deployment

Preferred Skills

  • Adept at data pre-processing & complex data transformations using Hive programming
  • Experience in working with HiveQL and performance tuning of Hive queries
  • Extensive experience with distributed systems frameworks like Hadoop, including hands-on experience with HDFS, MapReduce, etc.
  • Proficient in working with large-scale multi-node clusters using technologies like HDFS and Hive
  • Excellent SQL skills and in-depth understanding of relational databases
  • Adept at working in UNIX environments, with exposure to shell scripting
  • Knowledge of database design and data integration concepts such as OLAP, OLTP, and ETL
  • Experience in Python and Java will be an advantage
  • Familiarity with big data platforms like Qubole/AWS will be an advantage
  • An innovative bent of mind for developing creative solutions for data onboarding