ETL
Bangalore (Bangalore Urban) IT development
Job description
· Strong understanding of and familiarity with all Hadoop ecosystem components and Hadoop administration fundamentals
· Strong understanding of underlying Hadoop architectural concepts and distributed computing paradigms
· Experience developing Hadoop APIs and MapReduce jobs for large-scale data processing
· Hands-on programming experience in Apache Spark using Spark SQL and Spark Streaming, or in Apache Storm
· Hands-on experience with major components such as Hive, Pig, Spark, and MapReduce
· Experience working with at least one NoSQL data store: HBase, Cassandra, or MongoDB
· Experience with Hadoop clustering and auto-scaling
· Good knowledge of Apache Kafka and Apache Flume
· Knowledge of Spark and Kafka integration, including running multiple Spark jobs that consume messages from multiple Kafka partitions (see the first sketch after this list)
· Knowledge of Apache Oozie-based workflows
· Hands-on expertise with cloud services such as AWS or Microsoft Azure
· Solid understanding of ETL methodologies in a multi-tiered stack, integrating with Big Data systems like Hadoop and Cassandra.
· Experience with BI and data analytics databases
· Experience converting business problems and challenges into technical solutions, considering security, performance, scalability, etc.
· Experience implementing enterprise-grade solutions
· Knowledge of Big Data architecture patterns (Lambda, Kappa)
· Experience in performance benchmarking of enterprise applications
· Experience with data security (in transit and at rest)
· Develop standardized practices for delivering new products and capabilities using Big Data technologies, including data acquisition, transformation, and analysis.
· Define and develop client-specific best practices for data management within a Hadoop environment on the Azure cloud
· Recommend design alternatives for the data ingestion, processing, and provisioning layers
· Design and develop data ingestion programs to process large data sets in batch mode using Hive, Pig, and Sqoop (see the second sketch after this list)
· Develop data ingestion programs to ingest real-time data from live sources using Apache Kafka, Spark Streaming, and related technologies
· Strong knowledge of UNIX operating system concepts and shell scripting
· Flexible and proactive/self-motivated working style with strong personal ownership of problem resolution.
· Excellent communicator (written and verbal, formal and informal).
· Ability to multi-task under pressure and work independently with minimal supervision.
· Strong verbal and written communication skills.
· Must be a team player and enjoy working in a cooperative and collaborative team environment.
· Adaptable to new technologies and standards.
· Participate in all aspects of the Big Data solution delivery life cycle, including analysis, design, development, testing, production deployment, and support.
· Minimum 7 years of hands-on experience in one or more of the above areas.
· Minimum 10 years of industry experience.
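
For the Spark and Kafka integration item above, the following is a minimal PySpark sketch using Spark's Structured Streaming API (the successor to the DStream-based Spark Streaming named in this posting). The broker address, topic name, and checkpoint path are illustrative assumptions, and the job requires the org.apache.spark:spark-sql-kafka-0-10 package on the classpath; treat it as a sketch of the pattern, not a reference implementation.

```python
# Minimal sketch: one Spark job consuming from a multi-partition Kafka topic.
# Broker address, topic name, and checkpoint path are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = (
    SparkSession.builder
    .appName("kafka-stream-sketch")
    .getOrCreate()
)

# Subscribe to the topic; Spark distributes the topic's partitions across
# executor tasks, so a single job reads many partitions in parallel.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")  # assumed broker
    .option("subscribe", "events")                       # assumed topic
    .load()
)

# Kafka delivers key/value as binary; cast to strings before processing.
parsed = events.select(
    col("key").cast("string").alias("key"),
    col("value").cast("string").alias("value"),
    col("partition"),
    col("offset"),
)

# Console sink for demonstration; a real job would target HDFS, Hive,
# or another durable sink.
query = (
    parsed.writeStream
    .format("console")
    .option("checkpointLocation", "/tmp/checkpoints/events")  # assumed path
    .outputMode("append")
    .start()
)
query.awaitTermination()
```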
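For the batch ingestion item above: Hive, Pig, and Sqoop are driven from the command line, so as a language-consistent stand-in this second sketch shows the same batch pattern in Spark SQL against Hive, which the posting also calls for. The input path, column names, and database/table names are illustrative assumptions, not a prescribed pipeline.

```python
# Minimal batch-ingestion sketch using Spark SQL with Hive support.
# Input path, column names, and database/table names are assumptions.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("batch-ingest-sketch")
    .enableHiveSupport()  # lets Spark read and write Hive tables
    .getOrCreate()
)

# Read a large raw data set from HDFS (CSV assumed; schema inferred for brevity).
raw = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("hdfs:///data/raw/orders/")  # assumed path
)

# Light transformation step: drop obviously bad rows, then project columns.
clean = raw.dropna(subset=["order_id"]).select("order_id", "customer_id", "amount")

# Provision the result as a Hive table for downstream consumers.
(
    clean.write
    .mode("overwrite")
    .format("parquet")
    .saveAsTable("analytics.orders_clean")  # assumed database.table
)

spark.stop()
```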