Mahanirban Calcutta Research Group


 

Data Centres in India (October 2016 – April 2017)

Note on a Proposed Research Project on Data Centres in India (October 2016 – April 2017)

1. Data centres belong to the logistical world, which involves issues of infrastructure, software, and digital and computing labour. Driven by developments in information technology, data centres carry the logistical argument for the management of economy and polity to a new level, while remaining anchored in the overall logistical framework of politics, administration, and economics. They represent governance in one of its fundamental aspects, namely the management of data and information. Data centres emerged at a specific juncture in the history of computing, which runs from mainframes to personal computers and the client-server model, and on to distributed computing and software as a service. With increases in data transfer capacity, in the efficiency of classification, in the processing speed for structured and unstructured data, and in the speed of data retrieval, together with the development of technologies such as cloud and on-demand computing, we may say that data centres have in some sense transcended the preceding phase of networks of semi-autonomous computing devices. These dimensions are important to remember when studying the development of data centres in India, a country with a well-developed software sector within the information technology field.

2. Data centres are not virtual existences. A data centre is a material reality with a physical existence, a concrete location, and concrete infrastructure. As such, a data centre may be said to symbolise the combination of the materiality and immateriality of information. In this context it will also be important to enquire how the institution of the data centre has come to symbolise, at the same time, the centralisation and decentralisation of data management and governance.

3. Data centres started developing in India in the wake of the so-called Indian GDP revolution (7-9% annual growth over several past years). More than anything else, this has meant a massive expansion of electronic services owing to the growing volume of trade, insurance, and other financial services (and thus a massive increase in trade-related data), and a new digital sphere of functioning for regulatory bodies such as SEBI (Securities and Exchange Board of India), RBI (Reserve Bank of India), TRAI (Telecom Regulatory Authority of India), etc. All of these indicate an exponential increase in the volume of data in the wake of the expansion of trade and finance. This in turn requires robust practices to achieve new levels of data integrity and data safety, the adoption of the latest standards (such as ISO 27001:2013 in information security management), the management of comprehensive data sets, and the accompanying ability to analyse information. Data centres often appear as symbols of the contemporary world of mass generation of personal data as part of everyday digital processes, and of the convergence of such data through global digital identities, such as those offered by Google and Facebook accounts. They denote greater integration of data architecture, the best example of which in India is the information generated through the Unique Identity Project, which is to be integrated (and interchanged) with other stored data (such as financial data on bank accounts, recipients of subsidies, etc.). This means that task planning will become easier, which will affect time planning as well. It will also mean rapid expansion of data storage, processing, analysis, retrieval, and transmission, leading to economies of scale. But greater processing capacity also implies that an increasing amount of unstructured data is being handled. This is the basis of the big data phenomenon, which curiously co-exists with a sparse data syndrome. For instance, the unavailability of granular and/or interoperable data, as well as the context specificities of machine learning, may lead to a lack of warning in a situation that requires immediate distribution of food grain in the country in order to arrest a price rise or prevent hunger, or that requires a flexible warehouse utilization programme. These paradoxes are evident in the Indian scenario.

4. Several data centres (93) located in different parts of India (https://cloudscene.com/market/india/all) provide data centre services including dedicated server hosting, managed and unmanaged services, and co-location services. These data centres are usually spread across substantial tracts of land and have high-capacity servers. Because of their cost-effectiveness, the data centres are especially attractive to start-ups in the country, though they also serve clients abroad, catering to IT infrastructure needs such as data storage, data security, and interconnection. Real estate firms, media and video streaming firms, IT and ITES companies, and the bulk messaging industry are often the clients, though the manufacturing industry too needs data centres. These data centres represent technological advances in IT that ensure higher speed, greater power, and greater capabilities with regard to data and IP communications, including storage, retrieval, and transmission. High-speed global communications networks and services are the raison d’être of the data centres, which are crucial for transmitting critical data at nearly the speed of light to wherever in the world it is needed. They provide clients with fast, reliable IP communications and support them with services that are crucial for effective and efficient storage and for multi-media services, such as social media, voice and mobile signalling, cloud, big data, etc. These service providers cope with rapidly changing market dynamics, flock towards emerging market growth opportunities, and are in perpetual hunt for end users who need to communicate across multiple channels. In this way, digital evolution and the existing landscape of business mutually determine each other. Data centres also provide dedicated platforms that ensure the privacy and security of clients, with synchronisation across data centres to ensure business continuity. Business firms can co-locate their IT equipment and thus acquire a cost-effective alternative to building their own infrastructure. Co-location services provide regulated power, cooling, and physical security for server, storage, and networking equipment, and allow enterprises to connect to a network service provider of their choice; they also offer shared racks, dedicated racks, caged space, remote hands service, customer workspace, and reporting services. Clearly, data centres in India are a mark of a growing business environment in particular IT-related fields. Their lives are tied to trade cycles. They mark the centralization of the IT and ITES business and demonstrate the logistical dimension of IT infrastructure. They create their own logistical territories. Belapur in Navi Mumbai may be considered an instance of the aspects mentioned above and below.

5. Data centres also embody the risks, leakages, and breaking points of the global communication apparatus. Concerns about fire hazards, piracy, business slowdown, imperfect installation, big data missing certain crucial particularities, etc., drive data centres to develop disaster recovery planning and business continuity planning – something that calls for greater coordination among several authorities. Given the question of cost recovery in the uncertain times since 2008, the management of risk implies consideration of quality, cost, base factors, and systemic solidity. These measures also include ensuring information security, restrictions on software installation, a security policy for supplier relationships, response to incidents affecting information security, and other steps required as new controls under ISO 27001:2013. These measures suggest that risks have become part of normal planning. They also suggest greater reliance on the surrounding environment, which will include, among other things, greater public oversight.

6. The Indian experience prompts us to inquire whether data centres can be seen as markers of a new mode of governance. While some of the old modes of governance continue in information management, data centres indicate new modes. These new modes of governance are yet to develop fully, but we may find their rough outlines in new government initiatives in data management. On the one hand this means more centralized handling of data; on the other hand it offers scope for decentralized or dispersed handling. It also means more data-centric management of public life. More importantly, data management in India does not belong purely to the private domain of data service providers and IT giants. Data governance draws on the experiences of the postcolonial Indian state in dealing with society, population groups, security needs, welfare needs, and territorial management. We can refer to the huge volume of data generated, processed, interfaced, and retrieved in India by specialised public data collection institutions such as the Office of the Registrar General of India, which conducts the decennial Census, and the National Sample Survey Organisation on the one hand, and through sectoral data collection initiatives undertaken by national agencies and programmes such as the Reserve Bank of India, various public sector banks, the Securities and Exchange Board of India, the National Crime Records Bureau, and the Rural Employment Guarantee scheme on the other, besides of course the Unique Identification Authority of India, which offers verification of identity as a service to other government and private agencies. The NIC (National Informatics Centre) has set up National Data Centres in Delhi, Pune, and Hyderabad, and 30 small data centres at various state capitals. It also operates the open government data platform (https://data.gov.in/). Even a global company like IBM has set up a public data centre in Chennai to tap into government initiatives like Digital India and Smart Cities. The data centre will also reportedly enable the company to tap into data-sensitive sectors like banking, financial services, and telecom, which often mandate that data be hosted in local data centres (http://www.thehindu.com/sci-tech/technology/ibm-sets-up-public-data-centre-in-chennai/article7757332.ece).

7. Public management of data is more and more geared towards the interfacing of various kinds of data in what is called the public interest. India is highly developed in terms of record keeping, and has a long history of census taking and census analysis as well as of a banking industry; its data management infrastructure therefore draws a lot from state history and state capability. Any study of data centres in India has to take the twin factors of state capacity and state regulation into account. Such a study may also show how state capacity may change, acquire new abilities, or become dependent on private abilities in governing the informational world and shaping its own logistical ability. It is against this background that we have to critically analyse the nature of the regulatory regime of e-governance in the country, including data protection provisions, the Data Security Council of India, and the Information Technology Act as a whole. This is important given the large number of BPOs in India with access to large amounts of data – commercial and personal.

8. This in turn has acquired a critical dimension, namely public-private partnership (PPP) in data management. Even though the PPP model precedes the growth of data centres, the field of data management and the broad growth of IT infrastructure are inconceivable without PPP, whose aspects may range from land allotment, tax rebates, and collaboration in public data management to budgetary provisions and the outright granting of access to commercial interests - public and private. With the experience of Aadhaar, one may say that the market for interlinked data management is taking form through public-private partnership. Trade journals, relevant policies, and other aspects of e-governance are windows for understanding the dynamics of PPP in data management in the country.

9. Still, questions remain as to (a) the unknowable nature of certain risks (breach, piracy, and other risks mentioned in paragraph 5), (b) the hazards of trade cycles and crises like that of 2008, (c) the limits of the context for which the data is geared, (d) the contradiction between the openness of the source (let us remember that much of the vital technology is driven by free and open source software communities) and the private nature of the holdings, (e) possibilities of abuse (such as corruption, as happened in the US, or directing the data to “other” purposes), (f) ignoring the human factor, and (g) related to the previous point, the question of labour across the entire gamut of the data industry – from collection to end use.

10. This proposal seeks to study data centres and some of the select dimensions of data management in India in six ways:
(a) In the context of the data centres, inquiry into the evolution of public data management in India in terms of regulatory mechanisms and standards – the continuities and discontinuities; [one of the principal issues here may be how much the public management of data contributes to the building of social infrastructure like schools, public warehouses for essential commodities like food grains, flood control measures, etc.]. In this context, chronicling the history of the management of public data, such as Census data / National Sample Survey Organisation (NSSO) data / the organization of public sector banking data; this can also focus on a study of data collection for one of the key national schemes in India, like NREGA (National Rural Employment Guarantee Act) or the National Rural Health Mission. Such a study can take the form of an ethnography of the data collection process, the reality at the ground level, and how the state-citizen relationship is being shaped by such data sharing experiences. This can also focus on what can be called “data collection labour”, and the shift from amins (revenue officials who measure, collect, and certify land-related claims and documents) and census surveyors to the new forms of data collection workers on the ground.
(b) Mapping Belapur in Navi Mumbai, or one privately owned data centre elsewhere in India (for instance, in Hyderabad), as an instance; Belapur houses a cluster of data centres of various financial institutions. A study of the financial growth of the area, the profile of customers, land utilization, and the nature of the workforce may give an idea of the way the immediate and expanded territorialities of a data centre may materialize.
(c) Narrating a state government data centre in West Bengal that will tell us of the informational fields of local trade, health, taxation, security, and most importantly land, land revenue, and land utilization records – the lifeline of a vast number of people. The State Data Centre (SDC) has been identified as one of the important elements of the core infrastructure for supporting e-Governance initiatives under the National e-Governance Plan (NeGP). Under NeGP, it is proposed to create a State Data Centre for each State to consolidate digital services, applications, and infrastructure, so that the States can deliver electronic services efficiently through a common delivery platform seamlessly supported by core connectivity infrastructure such as the State Wide Area Network (SWAN) and Common Service Centre (CSC) connectivity extended up to the village level. With these shared service centres, implemented and managed by a competent Data Centre Operator (DCO), individual departments will hopefully be able to focus more on service delivery than on issues of infrastructure. In West Bengal, as we know, a four-storied building with a floor area of 5000 sq. ft. in the Webel Campus, Salt Lake, Kolkata has already been constructed and is ready to be handed over to the Data Centre Operator (DCO). At present the SDC is housed on the 1st floor, covering 4000 sq. ft. An additional 1000 sq. ft. on the same floor is earmarked for further expansion of the SDC. The State Switching Centre of WBSWAN, covering 1000 sq. ft., will be housed on the 2nd floor, and the remaining portion will be used for the future expansion of the SDC. The State Data Centre architecture complies with the SDC Scheme and meets the Tier II specifications (SLA > 99.749%), with a total area of 4000 sq. ft. and a server farm area of 1500 sq. ft. (http://www.webel-india.com/state_data.html). Further, the State Data Centre has to be studied as a key institution caught between the complementing and competing informational aspirations of the Union and State governments in India.
(d) Study of the Unique Identification Authority of India (Aadhaar project) as an instance of public-private partnership in the generation and management of data about transactions between residents and various public and private entities at the national scale, and the shifting modes of public-private data management it imagines and engenders;
(e) Study of the relation between data centres in India and those across the east and south-east Asian region;
(f) And, finally, issues of data secrecy, government guidelines on IT infrastructure, duplication, piracy, etc.
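The Tier II availability figure quoted under (c) can be read as a ceiling on permitted downtime. The short Python sketch below is only an illustrative calculation and is not drawn from the SDC Scheme itself; it assumes a non-leap year of 8,760 hours and treats the quoted SLA percentage as a minimum availability. The function name is hypothetical.

```python
# Illustrative only: converts an availability percentage into the
# maximum downtime it permits over one (non-leap) year.
HOURS_PER_YEAR = 365 * 24  # 8760 hours; assumption: leap years ignored


def max_annual_downtime_hours(availability_percent: float) -> float:
    """Return the hours of downtime per year allowed by an availability floor."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


if __name__ == "__main__":
    sla = 99.749  # SLA figure quoted for the West Bengal SDC
    print(f"Permitted downtime: {max_annual_downtime_hours(sla):.1f} hours per year")
```

Under these assumptions, an availability floor of 99.749% corresponds to roughly 22 hours of permitted downtime per year.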

11. These studies should reflect on issues of the IT workforce or, broadly speaking, labour in the information industry; IT locations, including land use patterns; security and surveillance; backup arrangements; and interlocking arrangements between data centres, IT firms, business firms, banks, financial companies, and the public authorities - along with complementary issues of decentralization and centralization of data management.

12. The study should take about 14-16 months. This is the most challenging part of the three-phase inquiry, and it should therefore be housed in CRG as far as possible to ensure that the work can proceed as per the plan. CRG will of course take the help of other experts and institutions, and collaborate with them.

13. The study will be in tune with two of the most important segments of the main statement of the research agenda:
(a) Communication of results: (i) Development of an original methodological and analytical framework for studying the impact of data centres on the governance of the digital economy, and how best to devise relevant policymaking; (ii) Increased understanding of regional patterns of data storage and mobility with an emphasis on the implications of Asia-Pacific data infrastructures for Australian culture, policymaking and practice; (iii) Provision of an interdisciplinary framework within which humanities and social science scholars can interact with software designers and data analysts to develop digital methods;
(b) The academic outputs of the Project will be research monographs and peer-reviewed articles to appear in venues such as Theory, Culture & Society, South Atlantic Quarterly, or Big Data & Society; an image-rich edited volume published with an architectural press such as Sternberg or Actar; and website documentation of research - resulting in increased critical knowledge of the political and social impacts of data centres, relating particularly to areas such as planning and zoning, regimes of labour, and data governance.
(c) Specifically, the tasks will mean: (i) A policy report, for dissemination on the project site and beyond, that covers the scope of topics and issues raised in the proposal. This report will draw connections between India and Singapore, Hong Kong, and, where possible, Sydney [month 16]; (ii) Four 2000-word pamphlet-style texts, to be authored and posted on the project site [months 6-12]; (iii) One peer-reviewed journal article or book chapter that will provide a critical appraisal of data centres, governance, and state transformation in India [submitted by month 16].

 
This is a collaborative project with Western Sydney University. The work continues in the wake of the attention devoted by CRG and University of Western Sydney to the problematic of logistics as a governing mode in contemporary capitalism. The project sought to study data centres and some of the select dimensions of data management in India in five ways: (a) Inquiry into the evolution of public data management in India in terms of regulatory mechanisms and standards; (b) Mapping Belapur in Navi Mumbai, which houses a cluster of data centres of various financial institutions; (c) Examination of a state government data centre in West Bengal that will tell us of the informational fields of local trade, health, taxation, security, and most importantly, land, land revenue, and land utilization records; (d) Study of the Unique Identification Authority of India (Aadhaar project) as an instance of public-private partnership in the generation and management of data about transactions between residents and various public and private entities at the national scale, and the shifting modes of public-private data management it imagines and engenders; (e) Issues of data secrecy, government guidelines on IT infrastructure, duplication, piracy, etc. The research was conducted under the guidance of Ranabir Samaddar with Ritam Sengupta as the Programme and Research Associate.

 

Researchers of the Project

  • Brett Neilson 

  • Ned Rossiter

  • Ranabir Samaddar

  • Manish Kumar Jha & Rishi Jha: “Data, Society and the City: Technology, Territory and Population”

  • Ritajyoti Bandyopadhyay: “Towards a History of Data at the Time of Big Data”

  • Ritam Sengupta: “Counting loss: An ethnography of recently introduced data management schemes in West Bengal, India” and “Mobilising Data, Localising Economies: Data Protection as a governmental concern in contemporary India”