Indicina is a venture-funded FinTech building the technology infrastructure that will power the next generation of consumer credit platforms and businesses. We genuinely believe that risk innovation can unlock the massive African consumer credit opportunity. Only 11% of Africa's population has its credit information recorded by private credit bureaus, versus 17% in Emerging Asia and 79% in Latin America. This, along with other challenges, puts a major brake on consumer lending in Africa.
- We are looking for a savvy Senior Big Data Engineer to join our growing team of data science experts.
- You will be responsible for creating, expanding, and optimizing data sets and data pipelines, and for improving the data collection workflows used by our cross-functional Data Science team.
- As a data engineer, you will partner with data scientists and product teams to build full-stack data science solutions using data from some of the world’s largest cloud apps.
Job Duties and Responsibilities
- Optimizing ETL processes: Assemble large, complex data sets to undertake a wide range of analyses that answer a variety of questions (a minimal pipeline sketch follows this list)
- Data architecture: Design data schemas and partner with engineering teams to implement high-performance views, including cloud-based workflows in R and Python
- Work closely with Data Scientists to optimize and re-engineer model code to be modular, efficient and scalable, and to deploy models to production.
- Configure, deploy, manage, and document data extraction, transformation, enrichment, and governance processes in cloud data platforms, including AWS
- Build visualizations to help derive meaningful insights from data.
- Partner with data scientists, PMs, engineers and business stakeholders to understand business and technical requirements, plan and execute projects, and communicate status, risks and issues.
- Perform root cause analysis of system and data issues and develop solutions as required.
- Design, develop, and maintain data pipelines and backend services for real-time decisioning, reporting, data collection, and related functions
- Manage CI/CD pipelines for developed services
- Product analytics: Perform exploratory analysis and answer questions at various stages of product development
- Communication: Communicate key insights and results to stakeholders and leadership. Interpret data and insights, and use storytelling and data visualization to recommend product decisions.
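To give candidates a concrete feel for the pipeline work described above, here is a minimal, illustrative PySpark sketch of a batch ETL job of the kind this role owns. The bucket paths, schema, and column names are invented for the example and are not Indicina's actual systems.

```python
# Minimal batch ETL sketch in PySpark. All paths, table names, and
# columns are hypothetical placeholders, not Indicina's real schema.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("loan-events-etl").getOrCreate()

# Extract: read raw JSON events from a (hypothetical) S3 landing zone.
raw = spark.read.json("s3://example-landing/loan-events/")

# Transform: normalize types, derive a date partition, drop bad rows.
clean = (
    raw
    .withColumn("amount", F.col("amount").cast("double"))
    .withColumn("event_date", F.to_date("event_timestamp"))
    .dropna(subset=["borrower_id", "amount", "event_date"])
    .dropDuplicates(["event_id"])
)

# Load: write partitioned Parquet for downstream Athena/Redshift queries.
(clean.write
      .mode("overwrite")
      .partitionBy("event_date")
      .parquet("s3://example-warehouse/loan_events/"))
```

Partitioning the output by date is what makes the downstream "highly optimized queries" (see Qualifications) possible, since query engines can then skip irrelevant partitions entirely.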
Qualifications
- Bachelor's degree in Computer Science (or a related field), or 4 years of production experience
- 3-5+ years’ demonstrated experience with Big Data systems, ETL, data processing, and analytics tools.
- 5+ years architecting, building, and maintaining end-to-end data systems and supporting services
- 2+ years’ experience with relational databases, as well as working familiarity with a variety of big data sources via Spark, Scala, and/or other big data systems.
- 2+ years’ experience building and optimizing big data workflows, including data lake architecture, business intelligence pipelines, and data visualization tools such as Power BI, Tableau, or Metabase.
- Experience with Amazon Web Services (AWS) offerings such as Athena, Redshift, and Glue.
- Ability to monitor, validate, and drive continuous improvement of methods, and to propose enhancements to data sources that improve usability and results.
- Demonstrated proficiency in writing complex, highly optimized queries across diverse data sets (see the query sketch after this list), and in implementing data science applications in R and Python using cloud-based tools and notebooks (Jupyter, Databricks, etc.).
- Experience working with large datasets (terabyte scale and growing) and tooling
- Production experience building, maintaining, and improving big data processing pipelines
- Production experience with stream and batch data processing, data distribution optimization, and data monitoring
- Experience maintaining a large software system and writing a test suite.
- Experience with continuous integration and version control tools such as Git.
- Deep understanding of data structures and common methods in data transformation.
- Excellent written and verbal communication skills
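For a sense of what "complex, highly optimized queries" means in practice, here is a hedged Spark SQL illustration that filters a large fact table on its partition column before joining, a standard optimization on partitioned data lakes. All table and column names are hypothetical.

```python
# Hypothetical Spark SQL example: partition pruning before a join,
# a common optimization pattern for large fact tables.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("repayment-summary").getOrCreate()

summary = spark.sql("""
    SELECT b.segment,
           COUNT(DISTINCT r.borrower_id) AS active_borrowers,
           SUM(r.amount)                 AS total_repaid
    FROM (
        -- Filter the large fact table on its partition column
        -- (event_date) first, so only one month of data is scanned.
        SELECT borrower_id, amount
        FROM loan_events
        WHERE event_date >= DATE '2024-01-01'
          AND event_date <  DATE '2024-02-01'
          AND event_type = 'repayment'
    ) r
    JOIN borrowers b ON b.borrower_id = r.borrower_id
    GROUP BY b.segment
""")
summary.show()
```

Pushing the partition filter into the subquery lets the engine prune unread partitions instead of scanning the full table and filtering after the join.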
Preferred
- Exposure to ML/AI techniques
- Experience working with large data sets in SQL/Spark/Hadoop/Data Lake or similar
- Experience with data ingestion and transformation using tools such as Logstash, Fluentd, Fluent Bit, or similar.
- Experience with data warehousing tools such as AWS Redshift, Google BigQuery, etc.
- Experience incorporating data processing and workflow management tools into data pipeline design
- Experience developing on cloud platforms (e.g., AWS, Azure) in a continuous delivery environment
- Ability to provide technical leadership to other team members.
Benefits
- Competitive salary
- Annual training allowance
- Work Tool + Internet Allowance
- Paid Time Off (20 days plus national holidays)
- Health Insurance
- Flexible work opportunities
- Group Life Insurance
- Performance Bonus
- Parental Leave