This framework integrates and aligns the people, processes, platforms and technologies in a rapidly digitizing world to uncover valuable business insights and gain a competitive advantage across industries. This platform aims at using data analytics to drive innovation and strategy and to improve efficiencies across functions. 

Part-I: Advanced Business Analytics Framework

The proposed framework (Fig.1) includes four complementary layers/platforms: Data platform, ABA platform, Information Presentation platform and Business Transformation platform. While having different focuses and serving different purposes, these layers are linked to each other and bridge the gap between strategic goals, machine learning algorithms, and data tables. The following steps/processes help in building a strategic ABA framework for significant success:

1.1       Identifying, assessing and prioritizing opportunities:

The opportunities for the ABA program will be identified and prioritized based on SWOT analysis and PESTEL analysis.

1.2       Building an Advanced Business Analytics (ABA) program:

A systematic approach will be followed to develop and deploy an ABA program to catalyze the growth of the target business. This process includes the following nine steps:

(i) The High Impact Question (HIQ): Success or failure often hinges on the questions we ask throughout the project lifecycle. The HIQ considered for developing this framework is:

“What are the pain points and challenges with the system/process/product of the business?”

(ii) Translating the HIQ into Data Science Question(s) and Defining a Model: Data is only as good as the questions you ask. The above HIQ will be translated into pertinent data science questions, such as: Where will the data come from? How can data quality be ensured? Which statistical analysis techniques should be applied? What ETL procedures, if any, need to be developed? Who are the final users of the analysis results? Which data visualizations should be chosen? What kind of software will help? Based on the answers to these questions, a suitable model will be defined.

(iii) Identifying the features/variables that could go into the model: Based on the HIQ, the features/variables that can go into the model are identified. The features of the model include (a) data features such as data processing, data mining, data governance and Master Data Management (MDM); (b) analytics features such as predictive/prescriptive analytics; (c) reporting features such as real-time reporting, dashboards and alerts; and (d) security features such as encryption and single sign-on. Similarly, the numeric/categorical variables that can go into the model will also be identified.

(iv) Narrowing the variable list to a manageable number: Once the variables are identified, the list is narrowed down to the most influential variables that could go into the model.

(v) Assess the availability and quality of the data (Data mapping): The data coming from various datasets will be migrated, ingested, processed and managed. The information obtained from multiple datasets will be merged into a single schema (table configuration) to query and derive insights from.

(vi) Data Acquisition Plan: Data required for building the ABA program will be gathered from both internal and external sources. Internal sources of data include the operational database of an organization where daily transactions are stored, while external sources of data include suppliers, customers, competitors, government agencies, the internet, etc. [5]. The data collected is extracted, cleansed, transformed and loaded into Data Warehouses and Data Marts and then analyzed for decision making.

(vii) Building a model with training data: Four types of analytics goals will be implemented through this model. The data from the data warehouse/data marts is analyzed using Diagnostic/Descriptive/Predictive/Prescriptive analytics. The techniques used in this program include Data Mining and Multidimensional Data Analysis.

Data Mining: It is vital to choose an appropriate data mining model/algorithm, which in this case is either regression or classification. A suitable model is built with the training data, and the model will be refined to suit a particular industry.

Multidimensional Data Analysis: It enables managers, executives and analysts to gain insight into data through rapid, reliable and collaborative access to a wide range of multidimensional views of information.

(viii) Refining the model and the questions: The feedback from stakeholders will be used to return to the modelling step and repeat evaluation and deployment phases to refine the model further.

(ix) Deployment: The ABA program will be delivered in a format usable by the end-users and stakeholders. Various tools used for reporting and visualization include dashboards, reports, ad-hoc reports, alerts etc. Critical business data/processes will be presented in the form of graphic indicators, charts and tables (graphical representation) using Dashboards depending upon the management level. Reports/ ad-hoc reports are created by end-users. Alerts in the form of emails or SMS are automatically created when an event occurs.
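Steps (iv), (v) and (vii) above can be sketched in a few lines of code. The following is only a minimal illustration under stated assumptions: the two source datasets, their field names (`train_id`, `passengers`, `stops`, `delay_min`) and all values are hypothetical, correlation ranking stands in for whichever variable-selection method a real program would use, and a one-variable least-squares line stands in for the regression model.

```python
from math import sqrt

# Hypothetical records from two source systems (names and values are illustrative only).
ticketing = [
    {"train_id": "T1", "passengers": 900},
    {"train_id": "T2", "passengers": 1200},
    {"train_id": "T3", "passengers": 1500},
    {"train_id": "T4", "passengers": 1800},
]
operations = [
    {"train_id": "T1", "stops": 5, "delay_min": 10.0},
    {"train_id": "T2", "stops": 9, "delay_min": 16.0},
    {"train_id": "T3", "stops": 4, "delay_min": 22.0},
    {"train_id": "T4", "stops": 8, "delay_min": 28.0},
]


def merge_on_key(left, right, key):
    """Step (v): merge two datasets into a single schema (inner join on a shared key)."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left if row[key] in right_by_key]


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def rank_variables(rows, candidates, target):
    """Step (iv): order candidate variables by absolute correlation with the target."""
    ys = [row[target] for row in rows]
    scores = {c: abs(pearson([row[c] for row in rows], ys)) for c in candidates}
    return sorted(scores, key=scores.get, reverse=True)


def fit_line(xs, ys):
    """Step (vii): ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return mean_y - slope * mean_x, slope


merged = merge_on_key(ticketing, operations, "train_id")
ranked = rank_variables(merged, ["passengers", "stops"], "delay_min")
best = ranked[0]  # the strongest single predictor in this toy data
intercept, slope = fit_line([r[best] for r in merged], [r["delay_min"] for r in merged])
print(best, round(intercept, 2), round(slope, 4))
```

In practice the training data would be far larger, the model would be validated on held-out data, and step (viii) would feed stakeholder feedback back into the variable list and the model choice.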

1.3       Eliminating Bias in the ABA Program:

The ABA program will be carefully developed to eliminate all six types of bias by choosing representative data sets, using a suitable model, regularly monitoring and reviewing the model, and ensuring that observers are appropriately trained.

1.4       ROI on data acquisition:

Even during a crisis, data and analytics can make a huge difference in your bottom line, finding new revenue, improving your supply chain, and reducing costs. The ROI will be measured based on the effect of the ABA program on costs & revenue, team efficiency, competitive advantage and speed-to-value.
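The financial part of this measurement can be expressed with the standard return-on-investment ratio, (net benefit) / (cost). The sketch below is an assumption about how the cost and revenue effects might be combined; the function name, parameters and figures are hypothetical, and the non-financial factors listed above (team efficiency, competitive advantage, speed-to-value) are not captured by it.

```python
def roi(incremental_revenue, cost_savings, program_cost):
    """Return ROI as a fraction: (benefits - cost) / cost.

    All three inputs are hypothetical estimates for one evaluation period,
    expressed in the same currency.
    """
    if program_cost <= 0:
        raise ValueError("program_cost must be positive")
    net_benefit = incremental_revenue + cost_savings - program_cost
    return net_benefit / program_cost


# Illustrative figures only: 300,000 in new revenue, 200,000 in savings, 400,000 spent.
example = roi(incremental_revenue=300_000, cost_savings=200_000, program_cost=400_000)
print(f"ROI: {example:.0%}")
```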

1.5       Deploying/getting the most benefit from the ABA program:

The ABA program is expected to provide actionable insights in the form of information that can be acted upon or the information that gives enough insight into the future to enable decision-makers to function. It is expected that the organizations utilize reporting and visualization to make fundamental changes in how the business is run. This transformation includes personnel, processes, and technology. It helps the organizations compete more effectively, become more efficient, and provide the best-in-class customer services.

Fig.1. A step-by-step framework to establish, execute and evaluate an Advanced Business Analytics (ABA) program in any Business Organisation


This part of the document provides a comprehensive overview of the potential applications of the ABA Platform, outlined in Part-I of this paper, in the railway domain. I have chosen Indian Railways to show the applicability of this ABA framework to the railway industry.

2.1       Brief overview of Indian Railways:

Indian Railways, the lifeline of the Indian nation, comprises a network divided into 17 railway zones. As the third-largest railway network in the world by size, it serves people from all walks of life. The safety of the millions of passengers that Indian Railways serves every day is of paramount importance to the system. Over the years, apart from following the regular safety norms, the network has taken several steps to enhance safety standards through innovative use of technology and training of its workforce [9].

2.2       The opportunity for Indian Railways through the ABA Program

Indian Railways started its operations back in 1853, yet it still suffers from multiple ailments. The task facing Indian Railways is best captured in the words of the railway minister during the presentation of the 2016-17 Railway Budget: “Railway facilities have not improved very substantially over the past few decades. A fundamental reason for this is the chronic underinvestment in Railways, which has led to congestion and over-utilization. As a consequence, capacity augmentation suffers, safety is challenged, and the quality-of-service delivery declines, leading to poor morale, reduced efficiency, sub-optimal freight and passenger traffic, and fewer financial resources. This again feeds the vicious cycle of under-investment. This cycle must be put to an end…” [10]. The mission before the Government of India is to “reorganize, restructure, rejuvenate Indian Railways.”

Indian Railways generates massive amounts of data every second of its functioning, most of which is collected manually. The numbers show why human intervention alone is not enough: the Railways carries more than 23 million passengers over a route of 66,000 km, passing through more than 7,100 railway stations and employing over 1.3 million people [10]. Indian Railways runs around 11,000 trains every day, of which 7,000 are passenger trains [9]. The need of the hour is to apply Advanced Business Analytics to the Big Data churned out across all aspects of operations: millions of passengers travelling at any given time; ticket reservations; sales points for various items in stations; locomotives, passenger and freight cars; maintenance and service; weighing, loading, dispatching and unloading freight; vendor management; and hundreds of thousands of staff at work across the country. Added to this is the safety and security of the passengers and the railway assets. If this is not a scenario that cries out for ABA, it is hard to imagine any other, anywhere in the world, that does.

2.3       The High Impact Question (HIQ) for the Railway domain:

Considering the most critical pain points and challenges with the system/processes/products of the Railway Industry, I have come up with the following HIQ for Indian Railways:

“How to increase the efficiency of Indian Railways manifold by addressing the pain points and challenges with its system/processes/products and make it 100% safe for commuters?”

2.4       Implementing the ABA Framework in Indian Railways:

It is proposed to use the ABA Platform outlined in Part-I of this paper (Fig.1) to help Indian Railways manage and optimize its performance, reduce downtime, and increase responsiveness and productivity. The implementation of the four layers of the ABA Program for Indian Railways in particular, and the railway industry in general, is briefly described below:

2.4.1       Data Platform and Data Acquisition:

The data required for the ABA program can be gathered from both internal and external sources, which include (i) infrastructure; (ii) trackside control-command and signalling; (iii) on-board control-command and signalling; (iv) energy; (v) rolling stock; (vi) operation and traffic management; (vii) railway yards; and (viii) maintenance. The sub-systems from which data can be acquired are classified as Fixed, Mobile and Organizational elements [11]:

a)  Fixed elements: the network, which is made of lines, stations, terminals, and all kinds of fixed equipment needed to ensure safe and continuous operation of the system;

b)  Mobile elements: vehicles travelling on the network; and

c)  Organizational elements: sub-systems, dealing with the functioning of the fixed and mobile elements.

These elements constantly exchange the data generated as part of normal operations (Fig.2). This data can be collected and managed using MDM, Data Governance and Data Mapping techniques so that it is ready for analysis in the ABA layer.

2.4.2       Advanced Business Analytics (ABA) Layer:

Big data and advanced analytics help railroad operators, network owners and service providers leverage artificial intelligence, machine learning, Bayesian techniques and predictive/prescriptive analytics to transform data into a strategic asset [12]. The proposed ABA program (Fig.1) optimizes railway operations through real-time analysis of yard, crew, vehicle, passenger, freight and other intermodal data.

Predictive Analytics: By acquiring and analyzing data and by using statistical algorithms and machine learning techniques, the likelihood of future outcomes can be identified based on historical data [13]. Predictive analytics can be used for the predictive maintenance of safety-critical railway assets. A machine-learning-based predictive algorithm can be developed that uses historical data to identify events that lead to train delays. When the system detects the same pattern again, an alert is raised so that the traffic controller can intervene in time. Further, building and maintaining a risk model that can predict catastrophic accidents requires a lot of data, time and resources. The use of big data, supported by a team of data scientists and railway specialists, can support this process.
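The delay-alert idea can be illustrated with a deliberately simple stand-in for a machine learning model: estimate, from historical records, how often each event type was followed by a delay, and raise an alert when a live event exceeds a chosen threshold. The event names, the history and the 0.5 threshold below are all hypothetical, and plain frequency counting is an assumption made for brevity, not the method proposed in this paper.

```python
from collections import Counter

# Hypothetical history: (event type, whether a train delay followed).
history = [
    ("signal_fault", True),
    ("signal_fault", True),
    ("signal_fault", False),
    ("door_fault", False),
    ("door_fault", False),
    ("track_obstruction", True),
    ("door_fault", True),
    ("door_fault", False),
]


def learn_delay_rates(records):
    """Estimate P(delay | event type) as a simple historical frequency."""
    totals, delayed = Counter(), Counter()
    for event, was_delayed in records:
        totals[event] += 1
        if was_delayed:
            delayed[event] += 1
    return {event: delayed[event] / totals[event] for event in totals}


def alerts_for(live_events, rates, threshold=0.5):
    """Return alert messages for live events whose estimated delay rate meets the threshold."""
    return [
        f"ALERT: '{event}' preceded delays in {rates[event]:.0%} of past cases"
        for event in live_events
        if rates.get(event, 0.0) >= threshold
    ]


rates = learn_delay_rates(history)
messages = alerts_for(["door_fault", "signal_fault"], rates)
for message in messages:
    print(message)
```

A production system would replace the frequency table with a trained classifier and would need far more history per event type before its estimates could be trusted.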

Prescriptive analytics: This represents a new approach to railway maintenance. As scheduling has become more sophisticated, maintenance processes have evolved and improved. Recent studies indicate that most equipment fails on a ‘random’ basis no matter how many assets are inspected. In other words, failures cannot necessarily be correlated to how maintenance was performed [12]. By adopting a prescriptive analytics approach, safety levels could be increased, considerable additional savings could be achieved and the operating ratio could be improved, while fuel consumption could potentially be decreased and train velocity increased at the same time. All of these are key goals for any rail transportation provider, and Indian Railways is no exception. That is why, in a challenging and competitive sector like railways, prescriptive analytics represents the way forward.
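Where predictive analytics estimates what is likely to happen, prescriptive analytics recommends what to do about it. A minimal sketch of that step is an expected-cost comparison: maintain an asset now if the maintenance cost is lower than the predicted failure probability multiplied by the cost of a failure. The asset names, probabilities and costs below are hypothetical, and the rule assumes (as a simplification) that maintenance fully removes the failure risk for the period considered.

```python
def recommend_action(failure_probability, failure_cost, maintenance_cost):
    """Recommend the action with the lower expected cost.

    Simplifying assumption: performing maintenance removes the failure risk
    for the period being evaluated.
    """
    expected_cost_if_waiting = failure_probability * failure_cost
    if maintenance_cost < expected_cost_if_waiting:
        return "maintain_now"
    return "keep_running"


# Hypothetical failure probabilities, e.g. produced by a predictive model.
predicted_failure_probability = {"points_machine_A": 0.30, "axle_counter_B": 0.02}

plan = {
    asset: recommend_action(p, failure_cost=50_000, maintenance_cost=5_000)
    for asset, p in predicted_failure_probability.items()
}
print(plan)
```

Real prescriptive systems typically optimize over many assets, crews and time windows at once; this single-asset rule only shows the shape of the decision.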

Machine Learning has a range of applications in the railway industry and can be utilised to advance the big data revolution in this context. Depending upon the requirement, one or more approaches will be utilised for the ABA Framework (Fig.1) under consideration. Appendix-1 shows a range of Machine Learning approaches and other advanced methods applied in the railway industry, as reported by Alawad et al. (2019) [17].

Fig.2. Elements of the Railway system

2.4.3       Information Presentation (Storytelling):

By tapping into data from passenger/freight movement, maintenance logs, sensors, etc., data analytics can provide valuable organizational and operational insights, including (i) asset tracking and management, (ii) rail operations, (iii) predictive/prescriptive maintenance and (iv) revenue [13]. In this layer, reporting and visualization will be achieved through a series of dashboards, reports, ad-hoc reports, alerts, etc.
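Dashboards of this kind typically sit on top of roll-ups of the same facts along different dimensions. The sketch below shows one way such a roll-up could work; the dimension names (`zone`, `category`), the measure (`revenue`) and all figures are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical fact rows; zones, categories and revenue figures are illustrative only.
facts = [
    {"zone": "North", "category": "passenger", "revenue": 120.0},
    {"zone": "North", "category": "freight", "revenue": 200.0},
    {"zone": "South", "category": "passenger", "revenue": 90.0},
    {"zone": "South", "category": "freight", "revenue": 150.0},
    {"zone": "North", "category": "passenger", "revenue": 30.0},
]


def roll_up(rows, dimensions, measure):
    """Sum a measure over every combination of the chosen dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# A senior-management view (by zone) and a more detailed view (by zone and category).
by_zone = roll_up(facts, ["zone"], "revenue")
by_zone_category = roll_up(facts, ["zone", "category"], "revenue")
print(by_zone)
print(by_zone_category)
```

The same function serves different management levels simply by changing the list of dimensions, which mirrors the multidimensional views described in Part-I.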

2.4.4       Actionable Insights and Business Transformation:

From routing, sorting and blocking cars, to scheduling and assigning locomotives, to smaller tasks like remote diagnostics and real-time monitoring of rail yards, every business decision and operation can be facilitated by machine learning and data analytics [14]. The proposed solution is a multi-tenanted, unified platform that allows role-based access for users, from line staff and managers to senior management and board-level executives. The solution integrated into the ABA platform allows the railway board to view the status of national operations on a virtual GIS dashboard. With this, managers and officers at different levels of Indian Railways can access information at the train or group level.

This ABA program serves as an enabler for real-time business transformation: analytics technology makes Indian Railways more agile in keeping pace with ever-changing customer needs and business trends.

2.5       ROI on the ABA Program for Indian Railways:

It is expected that, with the implementation of the ABA program on the outlined platform, Indian Railways will see a 100% increase in digitalization, accountability and ecosystem transparency, among other things. The availability of real-time information and robust alerts can improve performance by approximately 30-50%. Additionally, alerts for events or incidents that bring in location-wise information and intelligence give staff early warning signals about the sites, assets and people they manage, enabling a faster response. All of these can increase the net productivity of Indian Railways and, in turn, enhance the ROI on the ABA Program.

3.         Conclusion:

In this paper, an attempt is made to develop an ABA framework that can be applied across businesses and to demonstrate its use in a particular business case. Such an example serves as a preliminary validation of the expressiveness of the framework. The framework proposed and the case study discussed in this paper did not involve any real project work or real business stakeholders. As a result, the findings in this paper are primarily reported as potential benefits that need further validation.

Sreenivasa Rao Ganapa


[1] IBM. Business Analytics.

[2] Ransbotham, S., Kiron, D. and Prentice, P. K. (2016) “Beyond the hype: the hard work behind analytics success,” MIT Sloan Management Review, vol. 57, no. 3.

[3] Nalchigar, S. and Yu, E. Conceptual Modeling for Business Analytics: A Framework and Potential Benefits. The University of Toronto. papers/Conceptual%20Modeling%20for%20Business%20Analytics%20A%20Framework%20and%20Potential%20Benefits.pdf [Accessed on 19 April 2021]

[4] Deloitte. [Accessed on 21 April 2021]

[5] Khan, R. A., Nadeem, A. and Ali, A. (2019) Business Analytics: A Framework. International Journal of Computer Technology & Applications, Vol. 10(2), 102-108.

[6] Profisee. [Accessed on 20 April 2020]

[7] SAP. [Accessed on 21 April 2021]

[8] Technopedia. [Accessed on 21 April 2021]

[9] Indian Railways. About the Indian Railways. view_section.jsp?lang=0&id=0,1 [Accessed on 22 April 2021]

[10] Businessline (2018) Data analytics for smart railways. 20 January 2018. https://www.

[11] European Union Agency for Railways (2016) Big data in railways: Common Occurrence Reporting Programme. Technical document. Document ID: ERA-PRG-004-TD-003.

[12] Global Railway Review (2018) How prescriptive analytics can signal a new method of railway maintenance. Article: 19 October 2018. 74356/prescriptive-analytics-aspentech-rail/

[13] Cloudmoyo. Rail bigdata analytics.

[14] SAS. Predictive Analytics. What it is and why it matters. insights/analytics/predictive-analytics.html

[15] Galli, M. (2020) Deploying a Basic Model and Refining Methods. https://www. [Accessed on 25 April 2021]

[16] Colson, E. (2019) What AI-driven decision looks like. Harvard Business Review, July 08, 2019.

[17] Alawad, H., Kaewunruen, S. and An, M. (2020) Learning From Accidents: Machine Learning for Safety at Railway Stations. IEEE Access, Volume 8, pp. 633-648. Digital Object Identifier 10.1109/ACCESS.2019.2962072


Appendix-1: Examples of studies utilising Machine Learning and other advanced methods in Railway applications (Alawad et al., 2019) [17]

S.No. | Railway Application | Technique | Source Data
1 | Track geometry | Machine Learning | Data collected using track geometry car
2 | Automated track inspection | Deep learning, CNN | Data collected by track inspection vehicle
3 | Estimating the probability of failure of network operations | Machine Learning | Library and a database of historical incidents
4 | A decision support approach for rail maintenance | Deep learning, Fuzzy inference model | Video cameras
5 | Automatic visual inspection (object detection) | Deep learning, R-CNN | Images by inspection vehicle
6 | Optic fiber sensing for track defence | Machine Learning | Train-borne system
7 | Classification of accident causes | Deep learning, Text mining | Historical accident reports
8 | Dynamic wheel defects on railway tracks | Machine Learning, SVM, CNN | Installed sensor system on the railway network
9 | Fatigue crack detection and sizing | Machine Learning, ANN | Simulation-based data set of signal response by the alternating current field measurement (ACFM) sensor
10 | Classification for rail defects | Deep learning, Kernel principal component analysis, SVM | Laser ultrasonic
11 | Monitoring of railway catenaries for railway infrastructure maintenance | Bayesian network | Historical inspection data
12 | Passenger safety on a railway platform that detects risks in stations in real-time | A vision-based object detection algorithm | Historical data records

Sreenivasa Rao Ganapa

Managing Director & CEO

L2M Rail