Global Banking Self-Service Data Science Platform

A global corporate bank required a Self-Service Data Science platform for business users and clients, built primarily on open-source technologies and designed for deployment onto hybrid infrastructure.

The solution applies current AutoML practices to accelerate analytics use cases and AI-assisted analytics. It comprises four components:

  • AutoML: automates data preparation, feature engineering, model training and model testing. Provides feedback loops and persona-targeted visualisation covering DSL, no-code and low-code topologies.
  • OpsML: automates the Data Science workflow, model packaging and serving for integration with CI/CD and deployment pipelines in Kubernetes. Offers full elasticity, flexible resource planning and an infrastructure-agnostic platform.
  • Synthetic data: delivers domain-trained models and enables fast model training across sample data sources.
  • Data Fabric: abstracts physical data stores and locality from consumers through Service Mesh, Data Unification, Virtualisation and Federation.
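The Data Fabric idea, hiding physical stores and their locality behind a logical name, can be illustrated with a minimal sketch. This is not the platform's implementation: the class names `PhysicalSource` and `DataFabric`, the `register` and `read` methods, and the sample "trades" dataset are all hypothetical, chosen only to show a consumer reading one logical dataset that is federated across two locations.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class PhysicalSource:
    """Hypothetical descriptor of where a dataset physically lives."""
    location: str                         # e.g. "on-prem" or "cloud"
    reader: Callable[[], Iterable[dict]]  # returns rows as dictionaries


class DataFabric:
    """Hypothetical catalogue mapping logical names to physical sources."""

    def __init__(self) -> None:
        self._catalogue: Dict[str, List[PhysicalSource]] = {}

    def register(self, logical_name: str, source: PhysicalSource) -> None:
        # Several physical sources may back the same logical dataset.
        self._catalogue.setdefault(logical_name, []).append(source)

    def read(self, logical_name: str) -> List[dict]:
        # Federated read: concatenate rows from every registered source,
        # so the consumer never needs to know where the data is stored.
        if logical_name not in self._catalogue:
            raise KeyError(f"unknown dataset: {logical_name}")
        rows: List[dict] = []
        for source in self._catalogue[logical_name]:
            rows.extend(source.reader())
        return rows


# Illustrative data only: one logical dataset split across two locations.
fabric = DataFabric()
fabric.register("trades", PhysicalSource("on-prem", lambda: [{"id": 1, "amount": 100.0}]))
fabric.register("trades", PhysicalSource("cloud", lambda: [{"id": 2, "amount": 250.0}]))

rows = fabric.read("trades")
```

The consumer only ever refers to "trades". A real fabric would presumably add access control, schema unification and query push-down, none of which this sketch attempts.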

Main features

  • Designed the solution architecture and developed an infrastructure-agnostic platform as infrastructure as code, based on containers and deployable on Kubernetes to multiple locations. Developed a Data Fabric Store to handle structured, semi-structured and unstructured data.
  • Built data ingress and egress interfaces to consume and serve data for both streaming and batch processes.
  • Worked with internal business and technology teams to provide hands-on expertise and deliver Data Science models straight to production.
  • Developed an automated AutoML and OpsAI ecosystem that allows data scientists, business analysts and the wider business to build, train, test and deploy AI models against actual data in the application domain.
  • Advisory and consultancy on Data Science techniques: educated the enterprise on methodologies, operating models and processes for real-time production environments.
  • Metadata management: creation and maintenance of data catalogues, dictionaries, traceability, lineage and segregation.

Key Points

  • Minimum administration and operation costs and risks, by leveraging Kubernetes deployment capabilities for a fully elastic and flexible infrastructure.
  • Current-state analysis of capabilities and provision of a roadmap for developing the analytics capability.
  • Provided PoVs for key business use cases.
  • Automation of the Data Science lifecycle for AI business enablement and a significant reduction in the time needed to deliver use cases.
  • Reduced licence costs by committing to open-source technologies where appropriate, which also enables the rapid application of new releases.