Creating a centralized platform for data hosting 


The client is a leading pharmaceutical company that, following its merger with a European pharmaceutical company, now has over 30 manufacturing sites worldwide. The merger added central European production units to the existing ones in Japan and the Americas. The company manufactures medicines for patients on four platforms: Small Molecules, Biologics, Plasma, and Cell and Gene.

Project Brief/Ask

The client was digitizing its manufacturing data to generate insights and improve product quality. The data generated at each site served different use cases, and each site had numerous data connections and individual data pipelines that placed a heavy load on the on-premise data servers. To reduce that load, they approached our client to integrate quality and supply chain data from sources at various locations around the world, including databases and analytical platforms, into a cloud-based platform. The central platform functions as a repository that hosts data from multiple sources and sites. It lets users retrieve the data they need, which several teams then use for reporting and dashboarding.

Overall Solution

The project follows an Agile methodology, with a four-week build sprint followed by four weeks of testing and deployment. Different source systems are integrated into the data backbone. Data is extracted from sources hosted on cloud or on-premise servers located across the globe, from Asia to Latin America. Connections are set up to the sources, and the data is ingested using strategies such as real-time replication streaming and batch loads across several layers. Storage and computation take place in separate buckets and layers; the data flow between layers is automated, and each successive layer adds a new function that enhances the data.

The implementation and data flow approach

Sources -> Ingestion -> Storage and compute -> Extraction -> Consumption
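
The layered flow described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the layer names (raw, standardized, curated), the record fields (site, value, limit) and all function names are hypothetical, chosen only to show how an automated flow can pass records through successive layers, with each layer adding one enhancement function.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class Layer:
    """One storage/compute layer: a name, an enhancement step, and its stored output."""
    name: str
    enhance: Callable[[Record], Record]
    stored: List[Record] = field(default_factory=list)


# Hypothetical enhancement functions, one per layer.
def tag_raw(rec: Record) -> Record:
    # Raw layer: keep the record as ingested and note which layer holds it.
    return {**rec, "layer": "raw"}


def standardize_site(rec: Record) -> Record:
    # Standardized layer: normalize the site identifier.
    return {**rec, "site": str(rec["site"]).strip().upper(), "layer": "standardized"}


def flag_quality(rec: Record) -> Record:
    # Curated layer: add a derived quality flag for reporting.
    return {**rec, "in_spec": rec["value"] <= rec["limit"], "layer": "curated"}


def build_layers() -> List[Layer]:
    return [
        Layer("raw", tag_raw),
        Layer("standardized", standardize_site),
        Layer("curated", flag_quality),
    ]


def run_pipeline(records: List[Record], layers: List[Layer]) -> List[Record]:
    """Pass records through each layer in order, storing every layer's output."""
    current = list(records)
    for layer in layers:
        current = [layer.enhance(rec) for rec in current]
        layer.stored = current
    return current


if __name__ == "__main__":
    # Made-up batch standing in for data ingested from two sites.
    batch = [
        {"site": " site-a ", "value": 4.2, "limit": 5.0},
        {"site": "site-b", "value": 6.1, "limit": 5.0},
    ]
    for row in run_pipeline(batch, build_layers()):
        print(row)
```

Keeping each layer's output separately mirrors the idea of distinct buckets per layer: consumers that need only ingested data can read an early layer, while reporting and dashboarding use cases read the final one.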

Healthark’s Role

We conducted the business analysis activities and managed most of the client interaction from offshore. This covered requirement gathering, SDLC documentation, obtaining client approvals, backlog management and sprint planning using JIRA to maintain the sprint boards, and change and release management using ServiceNow (SNOW). This support kept upcoming releases running smoothly, kept the backlog prioritized, and gave the development team the client inputs it needed.


Because the sprints overlap and are in different phases at any given time, constant client contact is pivotal to the smooth implementation of the project. Coordinating the development team, the testing teams and the client stakeholders is necessary for the successful production deployment of the required data. Our team also improved the processes and recurring activities within each sprint.

