Online Research Resources for South Asian History
Summary and Keywords
Because so much of South Asia’s archival and primary source materials as well as precolonial and colonial-era published sources traditionally referred to by historians reside in physical archives and libraries that are difficult to access, the work of individual historians until recently had often been limited to resources they could access only from significant collections outside of South Asia, such as those at the British Library and at some major US research libraries. Research travel to South Asia to consult domestic collections there has always been expensive, impractical, and too often an exceedingly challenging endeavor because of the local limitations on access. But with the growth of the internet since the 1990s, and the relative ease of putting materials online, there has been an explosion of small- and large-scale efforts at digitization and online publishing of more unique and previously inaccessible treasures from South Asia. As of the early 2000s, a wealth of valuable open-access as well as commercially produced and distributed content is available online to scholars of South Asian history.
However, this profusion itself has created new challenges. The lack of selectivity, peer review, or other quality evaluations for much internet publishing, the dearth of standards for long-term website continuity and presentation, the absence of centralized pathways for structured discovery of these resources, the bewildering array of user interfaces, the increasing monetization of online access to primary source content, and the inadequate attention to digital preservation all make this universe of digital content a far from ideal setting for historical research. Collaborative endeavors by South Asia librarians and academic institutions, aimed at helping historians identify authoritative online sources that meet their research needs and learn how to access them, are beginning to yield useful results and to create orderly oases in the general chaos of the internet.
South Asia, as an object of academic study and research, is generally understood as a geographic region comprised of the modern nations of Bangladesh, Bhutan, India, Maldives, Nepal, Pakistan, and Sri Lanka. Much has been written about the colonial origin, history, meaning, and extent of reference of terms like this, as well as the “Indian subcontinent.”1 What about Burma to the east? Afghanistan to the west? Tibet to the north? These questions resonate differently for scholars in different fields, such as political history, than they might for citizens of these nations. For purposes of this article, South Asia is defined in strictly geographic terms, highlighting digital resources in support of historical research that either explicitly point to one of those countries or self-identify the scope of their coverage as “South Asia” per se.
This region is vast in scope, deep in the chronological extent of its recorded history, and tremendously complex in the linguistic, ethnic, religious, political, economic, and cultural granularities of its civilizations. Because peoples from South Asia have settled in almost every country of the world, this centuries-long and continuing diaspora is frequently considered an important object of study within the academic discipline of South Asian history. Primary and secondary sources to support the study of South Asian history, in the form of manuscript, printed and microfilmed materials, abound in their millions in libraries, archives, museums, and private collections throughout the region, in the archival institutions of the region’s former colonial powers (such as the British Library and the Biblioteca Nacional de Portugal), and in research libraries around the world, which have been explicitly collecting these materials, in some cases for centuries. The enormity of such holdings is illustrated by the magnitude of a published work intended to briefly list just the libraries in India itself2—it is 16 dense volumes in length, and very incomplete!
The vastness of holdings in South Asian libraries and archives, however, is offset by the long history of dismal preservation and archival practices, chaotic and inconsistent systems of discovery and access, and the generally scant investment in their development and upkeep. Most publications from South Asia are produced in print and have never been digitized, and even finding and identifying print holdings in South Asian libraries is daunting because of the lack of online catalog records. During most of the modern history of the field of South Asian studies, scholars attempting to conduct research by accessing these holdings in South Asia have found them notoriously difficult to work with, if they can get to them at all.3
In fact, historians from South Asia often find (and frequently lament) that some of the best preserved and most discoverable and accessible resources on South Asia can only be found outside the region (e.g., at the British Library)4 and in the academic research libraries of American and European universities with long histories of strong programs in South Asian Studies.5 Many of these institutions engaged for decades in collaborative efforts to preserve and make accessible South Asian content, initially through large microfilming consortia,6 themselves the forebears of modern digitization projects.
In the early days of the internet, before the advent of the World Wide Web, a small number of institutions and governments began providing online resources, texts, and databases with content on South Asia. These were scattered and hard to find, with no centralized discovery available. As the number and variety of such resources slowly began to grow, a way to integrate discovery and access for academic research purposes became more and more important. As with collection development of printed materials, it fell to librarians to devise ways to organize and integrate selected collections of online content to make them useful to scholars.7 An early, preweb example of such integration for South Asian Studies was the South Asia Gopher, created and maintained for some years by Columbia University.8 However, large-scale production of both born-digital and digitized-from-print resources online did not commence until the World Wide Web was invented and achieved widespread visibility and utility. An early effort at integration for the burgeoning web resources on South Asia (which were initially primarily from academia and government agencies, and only later from commercial online publishing ventures) was the successor to the South Asia Gopher—SARAI (South Asia Resource Access on the Internet),9 which assembled vast, hierarchically organized collections of annotated links to hand-picked online materials for this field, from the mid-1990s until about 2007.
Despite the fact that South Asian published and archival print resources have been relatively slow to appear online10 (with the vast majority of those holdings still untouched by the digital age), the 21st-century scholar of South Asian history can access online a much richer array than ever before of primary and secondary source material, ranging from individual texts, maps, and images digitized and put online by individual scholars, to large and diverse digital libraries hosted by individual universities, archives, and research centers, to efforts by international consortia at open-access online publishing, to expensive primary source collections for sale or subscription by large commercial ventures. But a key problem today—as in the early days of the web—is the lack of centralized discovery and access, the profusion of formats and the absence of widely accepted standards, the lack of attention to digital preservation to ensure long-term access to content, and the instability and short life span of many objects placed on the web. Despite good efforts by university-based portals, hubs, LibGuides, and other subject-structured, curated link collections (like SARAI), these initiatives tend to become unmanageable due to their inability to scale up in concert with the explosion of new content appearing online every day (and the vast number of previously accessible websites quietly disappearing or changing location on the web each day),11 as well as the inconsistent institutional investments in maintaining them online. While commercial publishers have produced some very valuable South Asia research content online—presumably with somewhat better stability than “web freebies”—they don’t address the problem of integrating discovery: their search engines understandably only highlight the content they produce and sell themselves.12
It is far beyond the scope of this article to point to all the useful online resources on South Asian history. Instead, a sampling of noteworthy efforts is described as exemplars of different categories and the challenges they face, including online archives produced by national libraries and government-supported agencies; university-based open-access repositories and archives; open-access digital library consortial projects; digital productions, databases, and publications by scholarly societies, research centers, and NGOs; and commercial online publishers and aggregators.
National Library and Government Online Archives
Many national, state, and municipal governments in South Asia maintain print archives of government documents and reports, gazettes, maps, manuscripts, local newspapers, and other published and unpublished works essential for historical research on the region. These are historical treasures of great value to the scholars who manage to go there and access these holdings onsite. Although many of these collections also have some online presence, their websites primarily give information about the organization or department rather than the content of the archive itself. Some of them also provide online indexes and catalogs of their holdings.13 Government and foundation-sponsored projects have attempted to preserve some of the holdings of these collections through digitization,14 but for the most part, even when digitization has taken place, what has been digitized has not so far been served on the web for open access by scholars. Therefore, as a whole, national and government libraries of South Asia have not yet become a significant source of online content for historians, who generally must still travel to these collections and attempt to use them in situ.15
Outside the subcontinent, the most notable national library collection that is actively digitizing its South Asia riches and putting them online is the British Library (hereafter BL). Most of the official record of the British Raj and the East India Company resides there, including the old India Office Records and Private Papers, as well as a wealth of colonial-era publications and other materials collected from India itself. A great portion of these collections is not held in libraries anywhere else in the world (including South Asia). An example of the kinds of indispensable historical materials produced by the British Raj is the vast set of “Native Newspaper Reports,” also known as “Vernacular Press Reports.” These were weekly and monthly English summaries and translations from local-language newspapers throughout India, produced for the British colonial administration from the 19th through early 20th centuries. Because of the unique breadth and depth of all these research materials on South Asia held at the BL, for decades, serious scholars of modern South Asian history have travelled there to conduct their research. Although this pilgrimage to London is still essential, the BL itself is increasingly engaged in digitizing parts of its South Asia holdings and putting them online, often for open access by scholars around the world.16 (Some of the BL’s digitization projects have been turned into commercial products marketed by Adam Matthew Digital and are not available for open access. See, for example, the BL’s press release and Adam Matthew’s product marketing for “The East India Company: Rise to Demise.”)
The BL is also partnering with libraries and archives in South Asia to fund local digitization of important holdings and putting them up on BL servers for sustainable open access, under its extensive Endangered Archives Programme (EAP). Scholars can browse to locate the dozens of individual South Asia collections digitized under the EAP—each containing dozens of distinct resources organized by file name—from the EAP’s projects page. Examples include everything from historical ephemera and manuscripts from Nepal, to early Malay writings from Sri Lanka, to records of 19th to 20th century Tamil village judicial assemblies, and publications of French India from 1800 to 1953. The primary purpose of the EAP project is the digital preservation of otherwise endangered “material of ‘pre-industrial’ societies that is in danger of destruction, neglect or physical deterioration,”17 and it fulfills this purpose excellently. The range of unique materials that have been rescued from oblivion by the EAP is very impressive, but many have noted that the contents, though rich, are scattershot, not easy to navigate or search, and difficult to use in systematic research. Although the metadata for each item is searchable through Google, there is no thematic or subjectwise organization of these holdings and no search-based discovery interface.
Another example of a national library that has undertaken some interesting digitization from its South Asia holdings is the National Library of Scotland (NLS). The India Papers Collection of the NLS rivals that of the BL, and they have highlighted their unique Medical History of British India collection by digitizing it and presenting it online. This collection, which consists of official publications varying from short reports to multivolume histories related to disease, public health, and medical research between 1850 and 1920, provides invaluable source material for the reconstruction of the history of disease and medicine in British India. Although a large section of it has an all-India scope, the collection is especially rich in documents related to Bombay and the Punjab. The Bombay plague of 1896–1899, one of India’s severest outbreaks, is particularly well covered.
The US Library of Congress is another national library with deep historical collections from and about South Asia. Among its relevant open-access digital productions with South Asia content for historians is the World Digital Library (WDL)—a collaboration with museums and libraries to create a central discovery and access portal for select primary source materials (and “cultural treasures”) from contributing institutions around the world. The WDL’s South Asia holdings consist of several hundred items, primarily published monographs, historical maps, prints, drawings, and photographs.
University-Based Open-Access Repositories and Archives
Many university research libraries in South Asia and other parts of the world hold impressive collections of historical materials from and about the region. Typically, the best of these materials are primary source resources held in rare book and special collections and are not available through interlibrary loan or other collection-sharing mechanisms. In general, because of restricted access regimens, these materials can only be consulted by visiting these libraries. Recently, however, there have been many efforts to make some of this material more accessible to scholars through selective digitization and open-access presentation online. These digitization initiatives are supported by foundation or government grants, private donors, or by the universities’ own resources.
Granth South Asia is a foundation-supported digitization project of the School of Cultural Texts and Records of Jadavpur University, India. It is engaged in digitizing many of its thematic collections, for example, the Bengal Chamber of Commerce and Industries Collection, which will be of interest to scholars of the economic history of Bengal. They also have a listing of many other bibliographic and full-text projects and databases targeting Bengali literary and historical sources, slated for online presentation. Osmania University Digital Library (Hyderabad, India) has digitized 45,000 documents from its collections for open access on the internet, but, like similar efforts at other South Asian universities, long-term stability of digital access is not assured, and as of this writing, the content seems to have gone dark on the servers of both the Digital Library of India and Osmania University itself.18
Outside South Asia as well, selective parts of university library collections are being digitized for open access. For example, the Centre for South Asian Studies at the University of Edinburgh has created a thematic research and teaching website entitled “Mutiny at the Margins: The Indian Uprising of 1857.” The site is rich with bibliographic citations, links to external resources, and numerous digitized full-text reports, historical texts, documents, correspondence, and other primary and secondary sources.
The University of Texas at Austin digitizes thousands of maps from its Perry-Castañeda Library Map Collection, which includes many dozens of detailed historical maps of “India and Adjacent Countries” from the early 20th century. Browsing their “Historic Maps of Asia” section yields a number of 18th- and 19th-century maps of India as well.
Another example of a university library that is actively digitizing South Asian historical materials for online open access is the University of Heidelberg. Their Digital Library’s section on digitized historical literature has a whole subsection devoted to literature on South Asia.
Beyond these kinds of thematic collections of digitized sources, many university libraries have projects that digitize individual books or runs of journals, in response to research needs of particular scholars, in organized on-demand “scan-and-deliver” services, or as part of general digitization programs. Those libraries engaged in this kind of digitization, while not targeting South Asia in any thematic way, still do sweep up many relevant historical resources in their online content, often presented for open access when the content is in the public domain. For example, given the strengths of Harvard’s library collections on South Asia, it is not surprising that it has digitized some journals such as its run from 1825 to 1828 of the Report of the Ladies’ Society for Native Female Education in Calcutta and Its Vicinity, or individual books such as Edwin T. Atkinson’s Notes on the History of the Himálaya of the N.-W. P., India (1883).19
A number of universities have engaged in a different sort of digital endeavor that creates enhanced access to historical content for scholars: the creation and linking of online data archives. For example, Harvard’s Institute for Quantitative Social Science (IQSS) has worked with many partner institutions to develop The Dataverse Project (an open source research data repository system). This endeavor allows partner libraries to assemble or create datasets from diverse sources (including data extracted from historical print sources) and to make them discoverable and accessible in a linked open framework. While most of the data are contemporary in nature, a number of historical data sources have also found their way into this system. For example, via Harvard’s Dataverse, one can discover datasets in the rich collections of the ICPSR (Interuniversity Consortium for Political and Social Research, based at the University of Michigan). Among many sources of data here are a few sets of South Asian historical data compiled from printed sources and converted into online data for analysis. These would include South Asia specific datasets (such as Rural Development in Deccan Maharashtra, India: Village Panel Study, 1947–1977, by Hemalata C. Dandekar and the Gokhale Institute of Politics and Economics), as well as multicountry comparative historical data that include South Asia (such as Workplace Ethnography Project 1944–2002, by Randy Hodson, or Correlates of War Project: International Trade Data, 1870–2006, by Katherine Barbieri, Omar Keshk, and Brian Pollins).20
From the scholar’s perspective, the problem with all these kinds of one-off resources is that the content—and its associated metadata, if any—are not aggregated for discovery into a single organized collection where one can find them. Some of them have local catalog records with links to the full text and with the record also loaded into the Online Computer Library Center (OCLC)/WorldCat, where they can be discovered by anyone using standard search types (author, subjects, etc.). Others are not loaded in WorldCat. Some digital libraries have metadata, robots.txt files, or other server settings that allow their contents to be crawled by global web indexes like Google, enabling one to find their contents through Google keyword search, but others are set up to prevent such crawling and indexing. This lack of consistency means that discovering these resources is difficult, unless you already know about them.21
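The crawling behavior described above can be illustrated with a minimal sketch. It uses Python's standard-library `urllib.robotparser` to test whether a given robots.txt policy permits a crawler to fetch an item page. The two policies, the host `library.example.org`, the item path, and the crawler name `ExampleBot` are all invented for illustration and do not describe any actual digital library; a real site may also restrict indexing by other means (page-level metadata or server settings) that this check does not see.

```python
from urllib.robotparser import RobotFileParser

# Two hypothetical robots.txt policies, invented for illustration only.
OPEN_POLICY = """\
User-agent: *
Disallow: /admin/
"""

CLOSED_POLICY = """\
User-agent: *
Disallow: /
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical item page on a hypothetical digital library host.
ITEM_URL = "https://library.example.org/items/gazetteer-vol-1"

open_result = is_crawlable(OPEN_POLICY, ITEM_URL)
closed_result = is_crawlable(CLOSED_POLICY, ITEM_URL)
print(open_result, closed_result)
```

Under the first policy the item page may be crawled and can therefore surface in a keyword search; under the second, a well-behaved crawler skips the whole site, and its contents remain invisible to anyone who does not already know where to look.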
Online Resources From Scholarly Societies and Research Centers
Beyond national libraries and university libraries, scholarly societies and research centers have also been significant players in producing and publishing online content of use to historical research on South Asia. Sometimes they are able to create open-access repositories; in other cases, access to their online resources is restricted to their members or offered for a fee on a nonprofit cost-recovery basis. For example, although the French Institute of Pondicherry (affiliated with the Ecole française d’Extrême-Orient, Paris) has a research focus on Indology and the environmental and social sciences, it has also produced an online Historical Atlas of South India.
The Dhananjayrao Gadgil Library of the Gokhale Institute of Politics and Economics, Pune, maintains an open-access digital institutional repository that is rich in primary source material on Indian economic and political history. Some other South Asian research centers do not have the resources to mount their own digital repositories online, but work with outside partners (such as the British Library’s Endangered Archives Programme) to digitize and present their historical materials. Examples would include the Centre for Studies in Social Sciences Calcutta, the Roja Muthiah Research Library in Chennai, the Madan Puraskar Pustakalaya in Kathmandu, the Sundarayya Vignana Kendram in Hyderabad, the Mushfiq Khwaja Library and Research Centre in Karachi, and many others.22
In the UK, the Wellcome Library (of the Wellcome Trust, London) has digitized many thousands of its books and manuscripts for the study of medical history and has particular strength in its coverage of South Asia. For example, a simple search for “India” in their online catalog, limited to full-text resources, reveals that they have put online more than 17,000 open-access Indian e-books on medical and social history topics, mostly from the 19th and early 20th centuries. The Wellcome Collection of historical images contains more than one thousand searchable South Asia-related images from historical library materials and museum objects.
In Paris, the project on Brahmaputra Studies: Languages, Cultures and Territories in Northeast India is a joint production put together by scholars associated with the École des Hautes Études en Sciences Sociales (the EHESS—the School of Advanced Studies in Social Sciences) and the Centre National de la Recherche Scientifique (CNRS). It contains a focused collection of resources for the study of this region of India, including a “corpora” section with numerous digitized full texts, maps, and bibliographic resources of interest for historians, assembled from a number of archives, libraries, and other internal and external sources.23
The Council of American Overseas Research Centers—CAORC—is an umbrella organization based in Washington, DC that helps support American academic research centers in countries around the world, including in five countries of South Asia: the American Institute of Bangladesh Studies, the American Institute of Indian Studies, the Association for Nepal and Himalayan Studies, the American Institute of Pakistan Studies, and the American Institute for Sri Lankan Studies. Each of these overseas research centers has its own library and access to local research resources and has contributed content to CAORC’s Digital Library for International Research—DLIR, including records for their international union catalog, a directory of local libraries and archives, and a small number of online full-text and image resources.24
Of course, there are many more small, local societies and associations with specific thematic focuses who are putting interesting resources for historical research on their websites. For example, see the British Association for Cemeteries in South Asia, the blog and home website of the Shahid Bhagat Singh Research Committee, and the Marxists Internet Archive, which has a section on Marxism and Anti-Imperialism in India, containing full-text transcriptions of the writings of figures in the history of the South Asian left, such as M.N. Roy, Shapurji Saklatvala, and many others.
The Panjab Digital Library is another example of a grassroots, crowd-funded, volunteer-powered noble digitization effort that has managed to amass an impressive collection of open-access online content, including much local historical material. Of course, all these kinds of organizations, lacking permanent institutional homes or ongoing infrastructural support, and not having the professional expertise of librarians and archivists at their disposal, are maintained on a shoestring budget and are unlikely to be able to preserve their digital content into the future. Likewise, their crowd-sourced records and contributions cannot encompass the bibliographic metadata or detailed cataloging that would be necessary for research-appropriate discovery and navigation, even in the present. Just as a trivial example, the Panjab Digital Library’s record for a book on Master Tara Singh lacks any specification of author, date, subjects, or even standardized romanizations or spellings of names, etc. Compare that, in terms of searching and the rigors of academic research, to the WorldCat record for the same book as cataloged by the Library of Congress. But the Panjab Digital Library does let you read the book online, once you manage to discover it!
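The contrast drawn above between a sparse crowd-sourced record and a fully cataloged one can be sketched as a simple completeness check. The sketch below is illustrative only: the field names are an assumed shorthand for the elements the paragraph mentions (author, date, subjects), and both sample records are invented placeholders rather than transcriptions of the Panjab Digital Library or WorldCat entries.

```python
# Fields the paragraph above singles out as necessary for research-grade
# discovery. The names are an assumed shorthand, not a cataloging standard.
REQUIRED_FIELDS = ("title", "author", "date", "subjects")


def missing_fields(record: dict) -> list:
    """List the required fields that are absent or empty in a catalog record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Both records are invented placeholders, not real catalog entries.
sparse_record = {"title": "Example biography"}
full_record = {
    "title": "Example biography",
    "author": "Example, Author",
    "date": "1950",
    "subjects": ["Example subject heading"],
}

print(missing_fields(sparse_record))  # ['author', 'date', 'subjects']
print(missing_fields(full_record))    # []
```

A record that fails such a check can still lead a reader to the full text, but it cannot be retrieved by an author, date, or subject search, which is the limitation the comparison with the Library of Congress record makes plain.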
Beyond digitization and digital library projects from universities, scholarly societies, and research centers, many of these organizations are also publishing e-journals with original historical research. There are commercially distributed university press e-journals, as well as open-access e-journals by research centers and societies—some peer-reviewed and of high academic quality, while others may be more marginal.25 Likewise, there are thematic e-book collections published by university presses, such as the New Cambridge History of India published by Cambridge University Press.
Open-Access Consortial Digital Libraries
A significant limitation on the ability of libraries and archives to turn their print collections into useful online digital libraries is the amount of technical expertise, infrastructure, and resources they would have to deploy to do it properly. Scholars often misunderstand what it takes to “put research materials online”: the assumption is that if you’ve got a scanner you can “digitize” anything. What this expectation fails to account for is that digitization (creation of digital images) is actually the least difficult, complex, or costly component of generating digital libraries for research.26 The real challenges come after you have the images: What about copyright? Are they legal to distribute online? Where to put them? How to serve them? How to preserve them for the future? How to ensure their long-term stability online at a consistent URL? And especially, how to create appropriate metadata to make them discoverable, findable, and usable? These are all questions that take expertise, infrastructure, internationally accepted standards, and substantial resource investment to address. These issues, if not addressed, can yield scattered, transient, unseen individual digital objects or sets of objects that do not contribute to the overall viability of online capacity for research. Among the resources outlined in the sections on digital content produced by governments, universities, research centers, and scholarly societies are numerous examples of well-intentioned digitization initiatives that fall into this category.27
A productive approach to addressing the challenges of digital library creation is to pool human, technical, and material resources among a community of stakeholder institutions in order to create a consortial project. One of the most widely known and successful projects along these lines is DSAL, the Digital South Asia Library. Initially created in the late 1990s by James Nye (bibliographer for Southern Asia, University of Chicago) and David Magier (then director of Area Studies, and South Asia librarian, Columbia University), and hosted ever since on the servers of the University of Chicago, the DSAL aimed to digitize target materials from libraries and archives in the United States, Britain, and South Asia and to make them discoverable and accessible through open-access presentation, extensive full-text indexing, and links to the DSAL promulgated through the wide network of participant stakeholder institutions. Supported with initial grants from the Association of Research Libraries, the Mellon Foundation, and the US Department of Education, the DSAL expanded the scope of its digital resources, encompassing high-quality digital editions of major reference works, such as the Imperial Gazetteer of India, reports from the Archaeological Survey of India, historical atlases, Statistical Abstracts Relating to British India (presented in e-book form and also statistical tables downloadable in Excel format), numerous union catalogs and bibliographies, digitized images of historical maps, historical runs of newspapers and journals, and many others. As the breadth and quality of the DSAL’s content increased, it attracted participation from an ever wider community of stakeholder partners and content contributors, including “leading U.S. universities, the Center for Research Libraries, the South Asia Microform Project, the Committee on South Asian Libraries and Documentation, the Association for Asian Studies, the Library of Congress, the Asia Society, the British Library, the University of Oxford, the University of Cambridge, MOZHI in India, the Sundarayya Vignana Kendram in India, Madan Puraskar Pustakalaya in Nepal, and other institutions in South Asia.”28
Although support for expansion and further development of the DSAL appears to have fallen off in recent years (after the initial project development grants had run their course), and many of its external links have not been maintained due to lack of DSAL staffing (leading now to some dead-end connections), the DSAL’s own rich internal content and indexing machinery is still spinning reliably on the University of Chicago’s servers, twenty years later. Historical sources of great value are still uniquely discoverable and accessible in the DSAL.
Spurred on by South Asia scholars’ expectations for broad online full-text access raised by the DSAL, and in reaction to recent trends of monetizing and commercializing historical resources that are derived from otherwise accessible but scattered public domain source materials, a collective of about twenty-five research libraries from around the United States and South Asia is representing the interests of their South Asian studies constituencies by directly contributing money, manpower, and other resources to a newly emerging consortial project: the South Asian Open Archives (SAOA) initiative. SAOA is supported by its members’ direct in-kind and monetary contributions, joint grant submissions, and other measures to assure both development support and long-term sustainability. Administratively housed at the Center for Research Libraries (CRL), under the aegis of the South Asia Materials Project (SAMP), the SAOA “is dedicated to creating a freely accessible, curated collection of historical research materials on South Asia,” including especially colonial-era materials in many languages (including English), such as administrative and trade reports, women’s periodicals, newspapers and magazines, census materials and gazetteers, and important literary and other monographic sources.29 SAOA has so far placed its digitized materials online in a temporary holding location on CRL’s own servers while it determines its long-term path to a permanent, more full-featured hosting and discovery setting to provide open access to digitally preserved materials, with all the valuable research functionalities currently available from commercial database providers. At present, the SAOA Executive Board anticipates launching a pilot instance of its full-featured research interface in the first half of 2018.
Larger consortia can assemble the resources necessary to undertake even larger-scale digital endeavors. For example, CRL itself is a consortium of more than 200 member research libraries, each contributing annual dues and other kinds of support and enabling it to directly sponsor projects, secure major project grants, and underwrite other collaborative endeavors. Although most of CRL’s programs and activities are closely geared to benefit the constituency of its own consortial members, many of the products of its digital initiatives find their way into the open-access domain of the web, where they benefit scholars worldwide regardless of affiliation. For example, CRL holds vast research resources (many of them unique) preserved in microfilm. As part of its overall mission of delivering content on demand to its members, CRL digitizes reels of film for delivery. It also maintains the resulting digital files on its servers and links them to the catalog records for the items in CRL’s catalog, making them discoverable and accessible by authenticated CRL members. A recent decision by the CRL Board, however, has enabled it to make its public-domain (pre-1924) digital resources available for open access by CRL members and nonmembers alike.30 This open category includes many of the items from CRL’s very rich microfilm holdings of South Asia materials (assembled in its own collecting programs and those of its associated SAMP). See, for example, the Proceedings of the First Indian National Congress (1885).
Because this kind of digitization on demand is not focused, as a program, on South Asian materials per se (and is instead directed, item by item, across the entire CRL collection by the demands of individual scholars), the result for South Asia is an online assortment that does not itself constitute a South Asian digital library, but is nonetheless a rich seam of resources that are preserved and reliably hosted with permanent URLs and catalog records, making them bibliographically discoverable.
There are also discipline-focused consortia that are creating online subject-domain archives with significant South Asian historical coverage. For example, the Law Library Microfilm Consortium (LLMC) has over 510 participating member libraries that are contributing digital content to their online members-only subscription database LLMC Digital. This preservation-oriented digital initiative has sections on the countries of South Asia which contain numerous colonial-era legislative acts, codes, judicial orders, case law, treaties, and legal treatises from the 17th through the early 20th centuries. These are invaluable resources for scholars of South Asian legal history.31
Beyond the online resources for South Asian history produced by South Asia-focused and general consortial groups mentioned in this section, there are also the materials digitized under vast mass digitization projects like Google Books, the HathiTrust Digital Library, and the Internet Archive. Some of the libraries that signed on as partners in the Google digitization project (such as the University of Michigan, Harvard, Cornell, Columbia, and the University of Wisconsin-Madison) happen to be libraries with historically strong holdings on South Asia. In this way, many important historical works were swept up in the project and are now discoverable online through simple keyword searches at Google Books, accessible either as open full text (e.g., An Account of the Kingdom of Nepaul, Being the Substance of Observations Made During a Mission to that Country, in the year 1793, by Colonel Kirkpatrick) or as snippets or “limited preview” (e.g., A History of Nepal, by John Whelpton, 2005), depending upon Google’s application of copyright. Scholars have noted gaps and limitations in the way the Google Books project was implemented: many books in non-Roman scripts were excluded, as were many public-domain foreign publications because of Google’s original strict 1923 cutoff policy. Also, efficient mass processing meant that certain form factors (maps, foldouts, etc.) were left out of the digital editions.
HathiTrust is a consortium of more than 120 partner libraries, focused on the mission of centralized preservation of digitized materials, especially digitized books (including those deposited by member libraries under the Google Project). But beyond preservation, the HathiTrust Digital Library also functions as a selective access platform, with nearly 8 million book titles consisting of nearly 16 million digitized volumes, of which almost 6 million are in the public domain (and therefore freely accessible in full text). Again, substantial numbers of public domain printed books on South Asian history, mostly published before 1924, have been digitized and deposited in HathiTrust and are now accessible to scholars through HathiTrust’s standard catalog search or through HathiTrust records imported into local catalogs. For example, see A journey to Katmandu (the capital of Nepaul) with the camp of Jung Bahadoor; including a sketch of the Nepaulese ambassador at home, by Laurence Oliphant, 1852. As with Google, many other books are discoverable and available for keyword searching within the book but are not viewable in full text due to copyright restrictions. (See, for example, The Kingdom of Nepal, by J. K. Chopra, 2000.)
The Internet Archive is a nonprofit digital library, supported by foundation funding, that has amassed a vast corpus of digital material (in all formats) harvested from the web and from institutional online collections. With a mission of providing “universal access to all knowledge,” the archive contains nearly 280 billion web pages and about 11 million books and texts. Organized as a means of web preservation, the archive is necessarily incomplete: its servers “crawl” target websites to download copies of the content, but the unstable nature of the source websites and the short life span of their content mean that these archival “snapshots” miss much that is critical. However, digitized historical materials from a number of libraries and research centers in South Asia have been harvested from the web and archived in the Internet Archive. For example, many books from the Digital Library of India project (currently offline due to copyright violations), the Million Book Project (a.k.a. Universal Library Project),32 and Osmania University were harvested by the Internet Archive while they existed online. See, for example, An Historical, Political and Statistical Account of Ceylon and its Dependencies, by Charles Pridham (1847), sourced from the Digital Library of India, and Report of the Indian Cinematograph Committee 1927–1928, sourced from the collections of Osmania University and digitized by the Million Book Project.
These mass digitization and mass digital archiving projects are valuable because they enable access to content from libraries that were not able to put it online themselves or lacked the infrastructure to maintain it online. Because these projects include some full-text indexing, they allow for rich discovery possibilities in support of research. However, some of them (Google, Internet Archive) lack the machinery of adequate and systematic bibliographic descriptions of the items they have ingested, such as authority control on personal names and standardized subject descriptors. These projects have also tended to exclude vast swaths of foreign language material or to include it without full-text searching (due to OCR problems with non-Roman scripts). Nonetheless, the research value of being able to type an obscure name or place or keyword into a box and uncover unique historical sources (even if only presented as snippets, with keyword-in-context, with only that page displayed instead of the whole book) cannot be denied.
Commercial Online Publishers and Aggregators
Commercially produced and distributed digital resources that support the study of South Asian history, typically available to scholars through the subscriptions and licenses purchased by their institution’s libraries, are numerous. (Generally aimed at the Western research library market, the subscription and pricing models for these products often put them beyond the reach of scholars at institutions in South Asia itself.) There are, of course, dozens of e-journals by major and minor commercial publishers.33 However, another growing area of commercial digital publication is in the realm of primary source thematic archival collections. These are sourced by the publishers from private collections or from libraries and archives that make revenue-sharing deals with the digital publishers and distributors. (Many of them are digitized from collections previously assembled for sale to libraries as microfilm series back when that was a favored distribution medium.)
A representative list of digital databases with rich South Asian primary source historical material (though not uniformly well-curated) would include collections published by Adam Matthew Digital, such as:
– Foreign Office Files for India, Pakistan and Afghanistan, 1947–1980. This collection is sourced from the UK National Archives and has primary source materials on political and diplomatic history and international relations.
– East India Company: India Office Records from the British Library, 1595–1947. This collection includes much Company correspondence, manuscript material, and related resources sourced from the British Library, with full-text and structured searching. It contains many resources for economic and political history.
– India Raj and Empire: Manuscript Collections from the National Library of Scotland. The collection is focused on social and political history.
– Empire Online. Sourced from dozens of archives, it covers primary sources for the British Empire, including substantial colonial-era South Asian content.
Another digital publisher in this sphere is Gale Cengage, whose full-text databases of interest to South Asian history include portions of their Archives Unbound series, such as:
Gale also publishes a number of broader full-text collections online that have rich seams of important historical content on South Asia. These include Eighteenth Century Collections Online (ECCO), which encompasses nearly every book published in English during the 18th century; and Nineteenth Century Collections Online—an assemblage of more than a dozen large thematic archives. These Gale collections have thousands of books and other sources on India and South Asia, which are easy to uncover through structured or open full-text searching.
ProQuest publishes a similarly broad online primary resource with much valuable content for South Asian history: the House of Commons Parliamentary Papers, which covers more than three centuries of archives of British government documents, Parliamentary debates, and other sources—the working documents of government for all areas of social, political, economic, and foreign policy, showing how issues were explored and legislation was formed. It is filled with primary sources essential for the colonial history of the subcontinent. Another valuable ProQuest product is ProQuest Historical Newspapers, which contains searchable full text of the Times of India from July 1861 through December 2008, as well as short 19th-century runs of The Bombay Times and Journal of Commerce and The Bombay Times and Standard.
Another example of a microfilm publisher who has digitally repackaged much of their archival backlist is British Online Archives (of the parent company Microform Academic Publishers). Among their interesting special collections online, one can find:
The South Asia Archive (SAA) is a different sort of digital resource, published by Routledge/Taylor & Francis. This expensive collection of assorted published and some archival materials on topics from archaeology to law to urban planning is assembled by the South Asia Research Foundation from private collectors’ holdings and a range of other sources. Some South Asia librarians (including the author) reviewed the SAA in detail and found that a significant portion of its content is actually in the public domain; much of it is available elsewhere for free as open-access material (e.g., through Google Books and HathiTrust) and is widely available in print and microfilm at many libraries that share their collections for broad scholarly access.34 The content itself, which emphasizes Bengali-language sources, certainly includes material of interest for historical research, but it is mixed in with a miscellany of individual sources, portions of runs of journals and annual reports, etc. that do not seem to cohere into a well-organized curated archival collection as such.
Sources for Discovery
A common theme that cuts across the rich universe of South Asian historical digital resource types lightly sampled in this article—whether presented by commercial publishers or aggregators, academic institutions or scholarly societies, consortial projects, government archives and libraries, or independent research centers, and regardless of their status as fee-based or free, or their quality as polished presentations, stable digital repositories, or ephemeral grassroots digital endeavors—is the challenge of discovering their existence and finding and using them. Contrary to the popular dream that the Google search interface has engendered, there really is no one place to go to find all this content. Discovery of individual online research resources is very much dependent upon the context of their presentation, the metadata or cataloging associated with them, and the degree to which any of that metadata is aggregated into any kind of centralized discovery framework.
Several kinds of aggregation do help expose and highlight at least some portions of that universe. Libraries have arrived at standardized ways of cataloging their books and aggregating their records into consolidated bibliographic databases, such as OCLC’s WorldCat, which contains more than 2 billion bibliographic records from more than 10,000 libraries worldwide. Many libraries also add records to their local catalogs pointing to selected online resources, especially the commercial ones they subscribe to, in order to improve discovery and access for their own students and faculty. Some selective efforts have also been made to catalog highly valuable, noncommercial web-based resources (called web freebies; see “Open-Access Consortial Digital Libraries” for examples), and the world of scholars can benefit when these detailed catalog records are also fed into the WorldCat database. Thus, as a simple example, when a few major research libraries created their own library catalog entries linking to the Digital South Asia Library (DSAL), those records were aggregated into WorldCat, where DSAL can be discovered by anyone, even if that person is not associated with those institutions. (See the WorldCat record.)
Similarly, when a library digitizes some of its own holdings to put them online for open access (e.g., as part of its institutional repository or a digital library project), it will typically add a bibliographic record for each item to its local catalog and share that record into WorldCat. (For example, a record for Harvard’s digitized copy of An Outline of Postal History and Practice, With a History of the Post Office of India, by Ivie G. J. Hamilton, Calcutta, 1910, can be found at WorldCat, and that record links to Harvard’s open-access online full text.)
Library catalogs provide rich, structured points of access (titles, authors, publishers, series names, and multiple standardized subject headings) for the discovery of individual books, journals, and other resources in support of all kinds of research. Individual library catalogs at institutions with strong research collections on South Asian history (e.g., Chicago and Harvard) are therefore good places to search. And when they include online resources and then aggregate their records into shared bibliographic databases like WorldCat, their value for centralized discovery becomes much greater.
The problem with cataloging, though, is that, like all item-level structured metadata, it is labor-intensive and expensive to create, and most libraries therefore tend to focus that investment primarily on items in their own stable print or digital collections. In other words, these institutions tend to catalog only what they own or create themselves, leaving great swaths of external online resources uncataloged and undiscoverable through their catalogs (or through WorldCat). While Google itself can uncover some of that larger universe of web-based resources, it lacks the structured access points that a catalog record contains. And only a portion of the world’s web-based resources are crawled and indexed by Google’s search engine.
A partial solution to this problem is created by South Asia subject specialist librarians (at institutions that have one), who compile thematic portals or directories of linked online resources organized by subject. Many research libraries provide an infrastructure for their subject librarians to create curated hierarchical thematic collections of links (or portals) to hand-picked online resources for specific subject constituencies. These collections are often referred to as LibGuides or Research Guides. South Asia librarians compile and maintain such guides, providing hierarchically structured browsable online “collections” of South Asia resources of all types, to facilitate discovery of those resources by their students and faculty beyond whatever is included in their library catalogs. The earliest examples of this kind of resource were the South Asia Gopher and SARAI (South Asia Resource Access on the Internet), both described in “Background.” Nowadays efforts along these lines tend to be more selective and realistic, identifying smaller sets of high-value resources rather than aiming for any degree of comprehensiveness. A typical current example would be the South Asian Studies Research Guide at New York University, which is compiled and maintained by the author of this article.35 These research guides are themselves generally hosted as open-access resources by the institutions that have them, though they often do contain restricted locally licensed resources mixed in with their links to open resources on the web. (In this NYU example, there are links to open-access materials such as the South Asian Migration Histories as well as to commercial e-journals such as MARG and databases such as The East India Company, which are accessible there via login only for NYU affiliates, or via the local subscription, if any, at the user’s own institution.)
The focus so far has been on digitized or born-digital primary and secondary online sources and the challenges of discovering and accessing them. Scholars of South Asia, however, especially in history and the humanities, face another kind of challenge to research effectiveness: article-level discovery for journals and book chapters. Much of the contemporary publishing output in South Asia is still primarily or exclusively in print. Even where libraries hold copies of these print journals and books, how can researchers discover the book chapters or articles they contain on a given topic? The standard catalog record describes only the book (not its chapters) or the journal (not its articles).
The solution to this problem of article-level discovery is another general class of online research resource: bibliographic indexes. For South Asian book chapters and journals, there are several indexes that cover print materials. The two key exemplars of this type of resource are the Bibliography of Asian Studies (BAS), and SALToC (South Asian Language Journals Cooperative Table of Contents Project).36
The BAS is a bibliographic index and discovery engine with nearly 1 million records for chapters in edited volumes, festschrifts, and conference proceedings, as well as articles in journals on all topics of Asian Studies. It is produced by the Association for Asian Studies (AAS), and is distributed online under contract with EBSCO. This database is widely considered the one indispensable research tool for scholars across the full interdisciplinary range of Asian Studies and is often promoted by the libraries that subscribe to it as the starting point for research at all levels—from undergraduate research papers to doctoral dissertation research and beyond. The BAS has full search and discovery features that lead to citations, enabling the researcher to discover the desired article or chapter and then find it in print or online in full text where available. BAS citations can link back to local catalog records for print books or journal holdings in one’s own institution’s library catalog, as well as to an online instance (if one exists) of the discovered article among the e-journal subscriptions of one’s own institution. Although the BAS itself contains only bibliographic information about each article (enabling aggregated discovery), it also facilitates easy access to the material, through active linkage.
Because the BAS’s linguistic focus is limited to Western language publications, and because English is the largest language of publication in the countries of South Asia, the region receives disproportionately deep and broad coverage in this database, including full indexing of many English-language journals published in the subcontinent and broadly held in US and European libraries, but not indexed anywhere else.37
Nonetheless, the BAS cannot serve the needs of advanced history scholars who need to discover relevant articles in journals published in the languages of South Asia. Most of these journals are produced only in print, and although there are significant holdings in the libraries of US and European institutions with strong South Asian Studies programs, discovery of articles in these journals is doubly difficult. These are inherently low-use, specialized materials in these libraries and are therefore frequently relegated to offsite storage facilities, from which scholars can request them for delivery if they have a specific citation. But since there is no published indexing for these journals (and no online presence available for searching), one finds it impossible to discover citations for the wealth of individual articles they contain. In the past, one could at least browse them on the shelves of the holding libraries and find relevant articles that way, but offsite holdings are not browsable, so usage declines, further warranting the library decision to either withdraw them or relegate them to remote storage. Discovery and access, then, are the victims of a Catch-22.
A modest, experimental attempt at a solution to this problem is being undertaken as a collaborative open-access project by a group of twelve research libraries: the SALToC Project (South Asian Language Journals Cooperative Table of Contents). In this project, each participating institution provides simple PDF scans of the tables of contents of the target South Asian language journals and submits them to a centrally aggregated, free, open-access website hosted as part of the institutional repository of New York University (and managed and edited by the author of this article). To keep the processing low-tech, the staffing minimal, and the whole project scalable and sustainable, SALToC makes no attempt to create a searchable database of articles as BAS does. The goal of SALToC is simply to provide a valuable substitute for the now lost ability at many libraries to browse the tables of contents of these journals in the library stacks. Here one can indeed discover needed articles and request them for delivery from offsite storage at one’s own institution or from interlibrary loan and document delivery services provided by cooperating libraries. To expand the discovery value of SALToC, links to the browsable tables of contents for each journal are added to the catalog records of the libraries and to the records for those journals in WorldCat. This model provides a stable online resource for discovery through the permanent URLs of NYU’s preservation-committed institutional repository. At present, SALToC consists of 1,519 table-of-contents files in 10 South Asian languages, covering long runs of 24 journals, supplied by 12 partners. It is expected to grow gradually as more and more institutions participate to leverage the value of their investment in print journal subscriptions from South Asia.38
This article provides a rough taxonomy of types of online resources for the study of South Asian history, broadly organized according to the kinds of organizations that produce them (national libraries, universities, research centers and societies, collaborative consortial groupings, and commercial publishers and aggregators), with just a few examples of each, followed by an overview of the challenges of discovery and approaches to their solution. For any serious scholar of South Asian history, the most essential conclusion to be drawn from this article is that—unlike many other fields—the content accessible online, though rich and growing and ever more discoverable, is still only the tiny tip of a vast iceberg, most of whose value must still be plumbed by consulting old-fashioned print. As always, it is libraries and librarians who, within the limits of respect for copyright, are mining that content and bringing it to the surface on behalf of preservation, discovery, and access. And it is the South Asia subject specialist librarians themselves (and the broad human networks of specialists they deploy for insiders’ advice) who are the best resource for pinpointing the perfect sources—in print or online—to meet scholars’ research needs.
(1.) See, for example, Sanjay Joshi, “Colonial Notion of South Asia,” South Asian Journal: Quarterly Magazine of South Asian Journalists and Scholars 1 (March–September 2003): 6–9, and other papers in that issue of the journal.
(2.) B. M. Gupta et al., eds., Handbook of Libraries, Archives and Information Centres in India. New Delhi: Information Industry Publications, 1984–.
(3.) As examples of recent observations of dismal conditions, see Choodie Shivaram, “How the National Archives of India Is Actually Destroying History,” The Wire, May 24, 2017. Also see articles by Dinyar Patel such as “India’s Archives: How Did Things Get This Bad?,” New York Times India Ink Blog, March 22, 2012; “In India, History Literally Rots Away,” New York Times India Ink Blog, March 20, 2012; and “Caring for History: Archaic Conservation Methods Are Themselves Hastening the Deterioration of Fragile Archival Material in India,” Himal Southasian 26, no. 1 (January 2013): 158–164.
(4.) There are, of course, caveats and a few counterexamples, such as with some archival records where the copy archived by the colonial authority lacks pages that survive now only in the local state archives in the former colony.
(5.) Some of the strongest North American library collections on South Asia are held at the more than forty institutional members of CONSALD—the Committee on South Asian Libraries and Documentation. Most of these started collecting in earnest in the 1960s or later, after the advent of the Library of Congress cooperative acquisitions programs based in India and Pakistan, subsidized by the US government’s PL480 program. (See Maureen Patterson, “The South Asian P.L. 480 Library Program, 1962–1968,” Journal of Asian Studies 28, no. 4 (August 1969): 743–754.) However, some US institutions, notably the University of Pennsylvania, the University of Chicago, Columbia University, Harvard University, and the New York Public Library, had started building South Asia-focused collections—especially in history, philosophy, and religion—many decades earlier.
(6.) For example, the South Asia Materials Project (SAMP), the Microfilming of Indian Publications Project (MIPP), and others. SAMP, in particular, has a long and eminent history in the collaborative preservation of South Asian research resources. See the 1988 project history by Jack Wells, “The South Asia Microform Project.”
(7.) Patricia Libutti, ed., Librarians as Learners, Librarians as Teachers: Diffusion of Internet Expertise in the Academic Library. Chicago: ACRL, 1999.
(8.) See David Magier, “The South Asia Gopher,” posting on the H-Asia Discussion List, 1994. Another preweb example from the 1990s was the Indology Listserv, initiated by Dominik Wujastyk at the then Wellcome Institute for the History of Medicine. Indology continues to be a major forum on manuscript and other sources of ancient and premodern South Asian history and has included many Indological texts transcribed by individual scholars and shared online in standardized formats.
(9.) SARAI was launched at Columbia in 1996 as a hierarchically structured web portal to a curated selection of the best internet resources on South Asia. Maintained there by South Asia librarian David Magier, with input from area specialists and librarians around the world, SARAI grew exponentially as a community resource until it was retired around 2010. Even by 2005, SARAI was viewed (in a review by Alyssa Ayres and Philip Oldenburg, India Briefing: Takeoff at Last? Armonk: M.E. Sharpe, 2005, 195) as having “grown into the mega link for all South Asia links.” The History Highway: A 21st Century Guide to Internet Resources, eds. Dennis A. Trinkle and Scott A. Merriman (Armonk: M.E. Sharpe, 2006, 131), considered that of all the Virtual Library sites, SARAI “has one of the cleanest and [most] useful opening pages, from which users can move quickly to a wide range of specific materials.” As of this writing, archived versions of SARAI still persist at Columbia.
(10.) A number of factors might explain why South Asian content was so slow to appear on the internet. One would be that well-funded mass digitization efforts like Google prioritized English language and Roman script sources. There are estimated to be more than 700 languages spoken in South Asia, and even accounting for the long colonial prominence of English, there is an immense heritage of materials in many of these languages. Nonetheless, with English as the modern international language and the language of science and business in the subcontinent, domestic IT professionals themselves did not focus on developing OCR for most non-Roman scripts of the region.
(11.) There is a vast literature on the notorious instability of URLs and citations to web-based content. See, for example, Wallace Koehler, “A Longitudinal Study of Web Pages Continued: A Consideration of Document Persistence,” Information Research 9, no. 2 (January 2004), which monitored a sampling of web pages between 1996 and 2003 and found that they had a half-life of only two years (i.e., on average, 50 percent of web pages disappear within two years). With the reduction in barriers to self-publishing on the web and the ubiquity of internet access worldwide, the average half-life of web content has presumably grown even shorter since then.
(12.) Too many of these commercial products are quite expensive and exhibit the negative consequences of the trend toward Western monetization of public-domain historical content from the subcontinent. So aside from any of their shortcomings of discovery, interface, or content, these products are also often simply not accessible to historians in South Asia, whose institutions cannot afford them, raising obvious questions about the ethics of neocolonial exploitation of cultural heritage. (See Clifford Lynch’s presentation on open access in the panel on “The Future of Scholarly Publishing: Alternative Perspectives to the Commercialization of Knowledge: Panel Discussion Held during Open Access Week”, sponsored by KU Libraries’ Center for Digital Scholarship, October 26, 2011.)
(14.) For example, see Surabhi Agarwal, “National Archives of India Set to Get a Digital Makeover of Its Vast Repository,” Times of India, August 15, 2017. Likewise, the vast colonial-era records of the Delhi Archive are also slated for digitization (and hopefully online open access). See “Delhi Archives to be Digitised, Microfilmed,” The Hindu, September 1, 2017.
(15.) Interesting counterexamples would include some specialized online resources, such as the “Old Sikkim Documents” and “Treaty and Agreements” at the Sikkim Archives. Also valuable for local history research is the State Archives of Rajasthan, which has digitized tens of thousands of documents by district and put them online for open access, where users click “login” and create a free personal account. The National Archives of India has put up a separate specialized site with full text of the large Netaji Subhas Chandra Bose Papers—a set of essential primary sources for researching the ferment of nationalist movements and party politics in India from the 1920s through World War II. The West Bengal Public Library Network also has an open online institutional archive with some historical content, such as gazettes from the 1880s. The Balochistan Archives of the Government of Balochistan, Pakistan, has digitized a substantial archive of historical manuscripts and documents. However, its website seems to have gone dark since June 2017, leaving only a trace preserved at the Internet Archive.
(16.) For example, the BL is preparing a very ambitious open-access digitization project—“Two Centuries of Indian Print: 1713–1914.” Currently still in its pilot phase, digitizing several thousand Bengali language books, the project promises to be a major boon to historians once its contents are placed online.
(17.) Cited in Margaret Coutts, Stepping Away from the Silos: Strategic Collaboration in Digitisation (Cambridge: Chandos Publishing, 2017), 70.
(18.) Likewise, the Digital Resources Centre of the Maulana Azad Library of Aligarh Muslim University is engaged in digitizing many books and journals from the AMU collections, but so far only a very small selection of these has been placed online. For example, a portion of the useful works of Sir Syed Ahmad Khan has appeared there in full text. The online open-access institutional repository of North-Eastern Hill University, Shillong, contains digital editions of many rare historical sources related to North-East India digitized from its library’s holdings, as well as the scholarly output of its own scholars.
(19.) Other similar examples would include the University of Michigan Digital Library, with open-access books like A journey to Katmandu (the capital of Nepaul) with the camp of Jung Bahadoor (1852), and Cornell University Library’s Southeast Asia Visions online collection, which includes Wanderings and wonderings: India, Burma, Kashmir, Ceylon, Singapore… (1892).
(20.) See also the historical data extracted in spreadsheet form from the Statistical Abstracts of British India (1840–1920) online as part of the Digital South Asia Library.
(21.) Universities in the West that have digitized parts of their holdings tend to have catalog records for the online content they have loaded into WorldCat for discovery. But a higher proportion of materials digitized by institutions in South Asia and other parts of the world either have no catalog records for the digital product, or they have catalog records that do not adhere to international standards for bibliographic data structures, or else the records have been loaded only into their local library catalogs and are not visible in integrated discovery databases like WorldCat. There is hope that emerging systems of linked data may eventually make bibliographic discovery possible across diverse sources and divergent metadata structures [see, e.g., Eero Hyvönen, “Publishing and Using Cultural Heritage Linked Data on the Semantic Web,” Synthesis Lectures on the Semantic Web: Theory and Technology 2, no. 1 (October 2012): 1–159], but even the enhanced capabilities of the Semantic Web appear to require some level of institutional coordination and standardization that may be out of reach for the small libraries in South Asia that are digitizing their unique holdings.
(22.) Preservation and digitization of select content from these research centers has been supported by the BL’s Endangered Archives Program, and much of the content is available through open-access online presentation at the EAP. EAP collections of materials have been assembled from digitization projects from the Roja Muthiah Research Library (Chennai), from the Center for Studies in the Social Sciences Calcutta, from the Madan Puraskar Pustakalaya (Kathmandu), from the Sundarayya Vignana Kendram (Hyderabad), and from the Mushfiq Khwaja Library and Research Centre (Karachi). (Many parts of these digital archives are still in development or not yet fully visible online.)
(23.) Another international collaboration of partner institutions, including Project Denjong, the British Library, and the Namgyal Institute of Tibetology (Gangtok, Sikkim), has been working with the Sikkim royal family to digitize the Sikkim Palace Archives, digitally preserved in the BL’s EAP. Similarly, scholars associated with L’Agence nationale de la recherche (ANR) in France and with the Deutsche Forschungsgemeinschaft (DFG) in Germany have created the online digital resources of the project entitled Social History of Tibetan Societies, including numerous scanned archival documents (e.g., Documents from Mustang).
(24.) Although the DLIR is no longer being updated, its bibliographic and full-text resources appear to be hosted in stable fashion by its partners.
(25.) See, for example, Pakistan Journal of History and Culture, from the National Institute of Historical and Cultural Research (Islamabad), and the Indian Journal of History of Science, from the Indian National Science Academy (New Delhi). The Association for Asian Studies publishes its peer-reviewed Journal of Asian Studies—which has substantial South Asian historical content—commercially via Cambridge University Press.
(26.) This is not to say that proper digitization to meet established international standards for image quality, resolution, and preservation is cheap or easy. (See, for example, “Minimum Digitization Capture Recommendations” of the Association for Library Collections and Technical Services Preservation and Reformatting Section.) Meeting these standards requires training, equipment, and institutional infrastructure that may be beyond the means of many organizations seeking to digitize their historical holdings, let alone that of individual scholars digitizing their private collections. Even where these technical standards can be met, the expertise and expense required to create usable metadata for discovery are generally even further out of reach.
(27.) Subject to the same kinds of limitations, there are also small-scale, well-intentioned efforts by individuals and small groups to produce open-access online resources on specialized topics. Examples include the Bangladesh Genocide Archive, an online archive of the chronology of events, documentation, audio, video, images, media reports, and eyewitness accounts of the 1971 genocide in Bangladesh at the hands of the Pakistan army, as well as the 1947 Partition Archive, a crowd-funded and crowd-sourced oral history archive of personal Partition stories.
(31.) Scholars of South Asian legal history will also find that Kanoon has much full-text open access Indian content of value. This service is primarily aimed at improving discovery and access of legal documentation by law practitioners, but its free component does include High Court judgments by state going back to the 19th century, as well as law tribunals, Law Commission reports, Constituent Assembly Debates, and Lok Sabha Debates.
(32.) This mass-digitization project, supported by the National Science Foundation and Carnegie Mellon University, ran digitization centers all over India for several years in the early 2000s and yielded results of mixed quality.
(33.) For example, South Asian history journals published by SAGE, as well as by Taylor & Francis, such as South Asian History and Culture. Many of these e-journals are now bundled by their publishers and distributors into large, broad package deals with content that goes well beyond South Asia itself.
(34.) For example, A collection of the acts passed by the Governor General of India in Council in the year … is held in print at Duke University, University of Chicago, the BL, and other research libraries. Likewise, The Indian Review is held in print and microfilm at dozens of libraries worldwide. Although the author has not conducted a comprehensive statistical survey, reviewing the actual lists of holdings of the South Asia Archive reveals that very many of its monographs, reports, and journals are accessible in library print holdings, and quite a few are already digitally accessible (e.g., through HathiTrust and Google Books).
(35.) Many other similar examples exist at varying levels of detail and organization and also varying in their inclusion of restricted, locally accessible resources integrated with internet-based open-access resources. Examples include those at Columbia, Chicago, Harvard, University of Washington, University of Hawaii, University of Texas, University of Virginia, and many more. All of these demonstrate the potential value of selective curation by a subject specialist building a “collection,” as well as the intractable problems resulting from the notorious instability of online resources. Even at high selectivity and small size, these collections are virtually impossible to keep up to date, let alone scale up for broader coverage: they contain numerous “dead-end links.”
(36.) There have been other efforts to create South Asia-focused bibliographic indexes. An example of an open-access indexing project operated for several years by a single research library is the South Asian Periodicals Index at the University of Wisconsin. Small, well-intentioned, single-institution projects like this are hard to sustain due to limited soft funding to support them, staff turnover, and limited human resources.
(37.) Many hundreds of journals from and about South Asia are indexed in the BAS, including those available only in print, as well as a number of key e-journals. Coverage of print journals important for history includes Ancient Pakistan (Peshawar), Annals of the Bhandarkar Oriental Research Institute (Pune), Bengal Past and Present (Calcutta), Central Asia (Peshawar), Deccan Studies (Hyderabad), Gandhāran Studies (Peshawar), Indian Journal of History of Science (New Delhi), Indica (Bombay), Journal of Ancient Indian History (Calcutta), Journal of Humanities and Social Sciences (Calcutta), Journal of the Indian Society of Oriental Art (Calcutta), Journal of the Pakistan Historical Society (Karachi), Pakistan Journal of History and Culture (Islamabad), Quarterly Journal of the Mythic Society (Bangalore), Quarterly Review of Historical Studies (Calcutta), South Asian Studies (Jaipur), and many dozens more. Of course, the BAS also includes chapter-level indexing of tens of thousands of books.
(38.) For more background on SALToC, see Aruna Magier, “Cooperative Collection Development Requires Access: SALToC—A Low-Tech, High-Value Distributed Online Project for Article-Level Discovery in Foreign-Language Print-Only Journals” (2014), in Proceedings of the Charleston Library Conference, eds. Beth R. Bernhardt, Leah H. Hinds, and Katina P. Strauch (West Lafayette: Purdue University, 2015), 213–218.