Innovative middleware lowers the administrative burden of managing large databases
For all the billions of pills, capsules and injections the biopharma industry produces, it generates even more data. Clinical trials generate terabytes of data; sales operations and marketing research assemble vast pools of prescriber and patient data; and commercial operations require mountains of data for recordkeeping and regulatory compliance.
In recent years, the IT response to these data compilations has been to create data warehouses or marts—repositories of stored enterprise data organized to provide ready answers about current business and research conditions—business “intelligence” rather than business “information.” But as the volume of data continues to rise, so do the number and size of these warehouses, driving up IT infrastructure costs even as the desired data becomes harder to reach.
Computer hardware, data storage, and enterprise application developers have responded to these challenges by establishing the practice of “virtualization”—techniques for organizing storage systems or computing power so that many physical resources behave as one much larger system, maximizing the online performance of the whole. Now, virtualization has been extended to data itself.
“Two of the problems with accessing large business databases today are the complexity of building applications that access these data quickly and reliably, and the difficulty of translating data in one form in a database into other forms usable by applications,” says Robert Eve, VP of marketing at Composite Software (San Mateo, CA). Composite has developed a set of middleware tools, including Composite Information Server and Composite Studio, to simplify the data-gathering and presentation process. “Ideally, the business analyst or other user of enterprise data doesn’t need to know either where the necessary data resides, or what format it is in. The middleware takes care of these issues by abstracting the data and making it readily available.”
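To make that abstraction concrete, consider the minimal sketch below (written in Python, and not representing Composite’s actual implementation, which the company has not published). The catalog, the source data, and the function names are all invented for illustration; the point is that a caller asks for a logical dataset name and receives uniform records, without knowing where the data lives or what format it is in.

```python
# Illustrative sketch only, not Composite's API. It shows the idea of
# "abstracting" data: callers request a logical name and get uniform
# records back, with no knowledge of the data's location or format.
import csv
import io
import json

# Hypothetical physical sources in two different formats.
CRM_JSON = '[{"id": 1, "name": "Acme Pharma"}]'
ERP_CSV = "id,name\n2,Beta Biologics\n"

def _from_json(raw):
    return json.loads(raw)

def _from_csv(raw):
    return list(csv.DictReader(io.StringIO(raw)))

# The virtualization layer: logical name -> (raw source, parser).
CATALOG = {
    "customers.crm": (CRM_JSON, _from_json),
    "customers.erp": (ERP_CSV, _from_csv),
}

def fetch(logical_name):
    """Return uniform dict records for a logical dataset name."""
    raw, parse = CATALOG[logical_name]
    return parse(raw)

print(fetch("customers.crm") + fetch("customers.erp"))
```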
Data services
IT developers have been working on the access and translation problems for several years, adopting various approaches such as purpose-built ETL (extract, transform and load) applications and enterprise information integration (EII). The advantage of data virtualization is that it reduces the application-specific coding needed to “join” data from disparate sources, and minimizes the computing burden of calling up and then presenting the data.
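The contrast with ETL can be shown with a toy example, again hypothetical and not drawn from any vendor’s code. Where ETL would copy both datasets into a warehouse before joining them, a virtualization engine performs the join at query time; the record sets and field names below are assumptions made for the sketch.

```python
# Minimal sketch (assumed sources and field names) of a virtual join:
# rows from two disparate systems are combined at query time, with no
# ETL step copying both datasets into a warehouse first.
prescribers = [  # e.g., from a sales-operations database
    {"prescriber_id": 101, "name": "Dr. Chen"},
    {"prescriber_id": 102, "name": "Dr. Patel"},
]
scripts = [  # e.g., from a separate prescription-data feed
    {"prescriber_id": 101, "product": "Drug A", "count": 40},
    {"prescriber_id": 102, "product": "Drug B", "count": 25},
]

def virtual_join(left, right, key):
    """Join two record sets in memory, as a federation engine would."""
    index = {row[key]: row for row in left}
    return [{**index[r[key]], **r} for r in right if r[key] in index]

for row in virtual_join(prescribers, scripts, "prescriber_id"):
    print(row)
```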
Data virtualization also complements another trend in information systems architecture—the move toward service-oriented architecture (SOA). Among other things, SOA decouples large, monolithic applications into assemblies of reusable “blocks” of code. When written according to industry standards, these blocks can be revised or updated without revising the entire application.
SOA needs a “data services layer” to connect the applications to stored data. The data services layer (another term for data virtualization) is often the largest component of an SOA software implementation. Web services (which use the standards of Internet communications to manage data flow among applications) also make a neat fit with SOA and data virtualization; here again, writing code to the appropriate industry standards is a necessary element of building the data services layer.
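As a rough illustration of a data service (using only Python’s standard library, with invented data and paths, and no relation to Composite’s products), the sketch below exposes stored records over HTTP as JSON, so that an SOA application consumes a service rather than querying a database directly.

```python
# A toy "data services layer" endpoint, illustrative only: it serves
# stored data to SOA clients over HTTP/JSON, so applications talk to
# a service instead of the underlying database.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

ORDERS = [{"order_id": 1, "status": "shipped"}]  # stand-in for a database

class DataService(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/orders":
            body = json.dumps(ORDERS).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

if __name__ == "__main__":
    # Blocks and serves requests at http://localhost:8000/orders
    HTTPServer(("localhost", 8000), DataService).serve_forever()
```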
Cache when you need to
In practice (see figure), the Composite tools interact with Web services or database queries from client applications, employing a “query processing engine” to obtain and translate the desired data. Some requests are one-time, “on the fly” queries that are built and delivered once. Others, representing routine or repeated queries (for example, a daily or weekly activity report from a sales force), can be set up and stored for reuse, either in a data cache (a temporary location) or in a more permanent dedicated storage system. (Another Composite tool, Active Cluster, meets this and related backup requirements.)
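The distinction between on-the-fly and repeated queries can be sketched as a simple cache with a time-to-live. The code below is illustrative only; the TTL value, function names, and query text are assumptions, not details of Composite’s caching.

```python
# Sketch of the caching trade-off described above (illustrative only):
# one-time queries run "on the fly," while repeated queries are served
# from a temporary cache until a time-to-live expires.
import time

_cache = {}  # query text -> (result, timestamp)
TTL_SECONDS = 24 * 60 * 60  # e.g., refresh a daily report once a day

def run_query(sql):
    """Stand-in for dispatching a query to the underlying sources."""
    return f"results for: {sql}"

def fetch(sql, cacheable=False):
    if cacheable:
        hit = _cache.get(sql)
        if hit and time.time() - hit[1] < TTL_SECONDS:
            return hit[0]           # repeated query: serve from cache
        result = run_query(sql)
        _cache[sql] = (result, time.time())
        return result
    return run_query(sql)           # one-time query: always on the fly

print(fetch("SELECT region, SUM(calls) FROM sales", cacheable=True))
```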
Users are trained to use the Composite Studio Modeler to build the desired applications. An ancillary part of the data virtualization process is abiding by system security requirements; Composite’s Eve says the virtualization tools replicate existing security options rather than creating a new set of options that would themselves need to be validated and maintained.
Another step in the development process is to create specific business objects, such as “customer reports” or “purchase orders.” Eve says that Composite has built over 80 of these objects, preconfigured to work with data structures such as those in SAP enterprise systems or Oracle databases (including Oracle Siebel customer-relationship management applications), to expedite the data services presentation.
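A business object of this kind might be pictured, very loosely, as a preconfigured view that merges fields from several back-end systems into one report-ready record. In the hypothetical sketch below, all system names and fields are invented for illustration.

```python
# Hypothetical "business object": a preconfigured view that assembles a
# customer report from fields spread across two assumed back-end systems
# (the names below are invented, not real SAP or Oracle structures).
erp_accounts = {"C-100": {"customer": "Acme Pharma", "balance": 12000}}
crm_contacts = {"C-100": {"rep": "J. Rivera", "last_call": "2024-05-01"}}

def customer_report(customer_id):
    """Return one merged record, ready for a reporting tool to consume."""
    record = {"customer_id": customer_id}
    record.update(erp_accounts.get(customer_id, {}))
    record.update(crm_contacts.get(customer_id, {}))
    return record

print(customer_report("C-100"))
```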
Through OEM agreements, Composite’s software is built into the business intelligence and reporting tools of IBM Cognos and several other leading reporting vendors. Composite Software provides training and system-optimization services for clients, and works with leading systems integrators to implement its tools.
Composite’s software has been used by several leading biopharma organizations. At one, overall R&D project development time was cut by 5% through faster delivery of strategic planning information to key line-of-business senior executives and managers, and business-analyst productivity essentially doubled because analysts no longer had to build and maintain permanent data marts for one-time, nonrecurring uses of a data service.