Make Room for the Monster Databases
By Joe McKendrick

They're big, and they're getting bigger. New statistics out of Winter Corporation - an analyst firm that tracks database size on an annual basis - find that databases are poised to hit the 100-terabyte mark within the next two years. Remarkable, considering that just a year ago, the largest databases on record were between five and ten terabytes in size.
...
Some of the biggest databases are growing by a factor of 20, Richard Winter, CEO of Winter Corporation, told DBTA. "They're projecting growth in databases by 2004 that are in the range of 100 terabytes of data," he said. Even more remarkable is the fact that Winter is only measuring actual data stored in his annual survey - not system overhead. A hundred terabytes of data "easily equates to more than 500 TBs of disk," he states.
...
Much of this growth is around decision support functions, according to the latest Winter survey. "We found that the average growth projection for decision support databases over three years was 169 percent. For transaction processing systems, the rate was 124 percent," Winter said. "On the average, decision support databases were a half a terabyte larger than operational systems," he added. The Winter survey also projects that this gap will widen to more than two terabytes. "Decision support databases are growing much faster than transaction processing databases - they're almost tripling in size over the next two years."
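To make the survey percentages concrete, the short Python sketch below converts a growth percentage into a size multiplier and an equivalent compound annual rate. The function names and the 500 GB starting size are illustrative assumptions, not figures from the Winter survey; only the 169 percent and 124 percent projections come from the article.

```python
def growth_factor(percent_growth):
    """Return the size multiplier implied by a percentage growth figure."""
    return 1 + percent_growth / 100


def annualized_rate(percent_growth, years):
    """Return the equivalent compound annual growth rate as a fraction."""
    return growth_factor(percent_growth) ** (1 / years) - 1


def projected_size(current_size, percent_growth):
    """Return the projected size after the stated growth is applied."""
    return current_size * growth_factor(percent_growth)


if __name__ == "__main__":
    # Survey projections quoted in the article (three-year horizon).
    for label, pct in (("decision support", 169), ("transaction processing", 124)):
        factor = growth_factor(pct)
        yearly = annualized_rate(pct, 3)
        print(f"{label}: {factor:.2f}x overall, about {yearly:.0%} per year")

    # Hypothetical 500 GB starting point, chosen only for illustration.
    print(f"500 GB grows to about {projected_size(500, 169):.0f} GB")
```

Growth of 169 percent means the database ends up at roughly 2.7 times its starting size, which is consistent with Winter's description of decision support databases "almost tripling."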
...
At the core of this decision support, of course, is customer data. CRM was the principal application among the very large databases measured by Winter, followed by e-commerce. Interestingly, most of these applications for very large databases were custom developed by the end-user company. For example, the survey found that 41 percent of CRM solutions were custom developed, versus 7 percent acquired through a DBMS vendor, and 7 percent through another third-party application. Another 45 percent were not sure of the source of their CRM applications.
...
Of course, most companies don't have multi-terabyte behemoths on site. In fact, most are still working on reaching their first terabyte. In the mainstream, typical large databases range anywhere from 100 GBs to 500 GBs in size, Winter reported. Such database power is in evidence at Foxwoods Resort Casino in Connecticut, the nation's largest casino complex. The resort casino, owned and operated by the Mashantucket Pequot Tribal Nation, relies on a 300 GB database for both marketing and casino financial information. The system, which runs on a Progress database, currently supports about 20 million transactions a day, according to Todd Williams, MIS database and DSS manager for Foxwoods, in an exclusive interview with DBTA.
...
The casino's player management system is a "customer tracking and loyalty program [that allows] the business users to track activity, better understand profitability, and better manage those key customer relationships," said Williams. "Much of the core application database consists of millions of patrons, and hundreds of millions of player rating transactions. These ratings help track patrons' gaming activity which provides insight into the Marketing Initiatives. Those transactions also feed into a loyalty program that allows patrons to earn comp points. So a certain amount of gaming play gets rewarded with 'points' that can be used at retail shops, restaurants, and for many other casino offerings." The other half of the database supports many other casino operations, including financials, Williams added. The Progress database system runs on IBM pSeries hardware running AIX.
...
As found in the Winter survey, the need for customer insights drives a number of large database deployments. For example, GHS Data Management, a service bureau to state agencies and insurance providers in Maine, relies on a 100 GB database to monitor prescription benefits. GHS provides pharmacy benefit management for the private sector and MaineCare, the state's public healthcare assistance program. The data warehouse itself is supported on an IBM pSeries processor running a Pick database on AIX. The front-end data analysis portion, provided through Databeacon, runs on Microsoft SQL Server on Windows 2000.
...
"For transactional processing, we store everything in a normalized OLTP structure, for our baseline format," says Jason Skeffington, project manager at GHS. "We then build star schemas and cubes from the relational structure. We have analysts here who do heavy-duty querying and aggregation that create standard reports that we publish online to our clients." Data analyzed helps ensure that clients are able to purchase prescriptions in the most efficient manner possible. "We provide an online, real-time, claims adjudication system," Skeffington told DBTA. "When a pharmacist enters a patient's information into his PC to process a prescription, we're at the other end of the pharmacy computer. Our system takes care of client info, drug pricing, co-pays, prior authorization, and pharmacy billing."
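As a rough illustration of the approach Skeffington describes (a normalized baseline from which star schemas are built), the Python sketch below uses the standard `sqlite3` module. Every table name, column name, and sample row is hypothetical and chosen only for illustration; GHS's actual system runs on Pick and SQL Server, and its schema is not described in the article.

```python
import sqlite3


def build_star_schema(conn):
    """Derive a tiny star schema from normalized OLTP-style tables."""
    cur = conn.cursor()
    # Normalized source tables with made-up sample data (all hypothetical).
    cur.executescript("""
        CREATE TABLE drug (drug_id INTEGER PRIMARY KEY, name TEXT, drug_class TEXT);
        CREATE TABLE pharmacy (pharmacy_id INTEGER PRIMARY KEY, city TEXT);
        CREATE TABLE claim (
            claim_id INTEGER PRIMARY KEY,
            drug_id INTEGER REFERENCES drug(drug_id),
            pharmacy_id INTEGER REFERENCES pharmacy(pharmacy_id),
            filled_on TEXT,
            amount_paid REAL
        );
        INSERT INTO drug VALUES (1, 'DrugA', 'statin'), (2, 'DrugB', 'antibiotic');
        INSERT INTO pharmacy VALUES (10, 'Portland'), (11, 'Bangor');
        INSERT INTO claim VALUES
            (100, 1, 10, '2002-10-03', 25.0),
            (101, 1, 11, '2002-10-15', 30.0),
            (102, 2, 10, '2002-11-02', 12.5);
    """)
    # Star schema: descriptive dimension tables around one central fact table.
    cur.executescript("""
        CREATE TABLE dim_drug AS SELECT drug_id, name, drug_class FROM drug;
        CREATE TABLE dim_pharmacy AS SELECT pharmacy_id, city FROM pharmacy;
        CREATE TABLE dim_month AS
            SELECT DISTINCT substr(filled_on, 1, 7) AS month FROM claim;
        CREATE TABLE fact_claim AS
            SELECT drug_id, pharmacy_id,
                   substr(filled_on, 1, 7) AS month, amount_paid
            FROM claim;
    """)
    conn.commit()


def spend_by_class_and_month(conn):
    """Aggregate the fact table along two dimensions, as a report might."""
    return conn.execute("""
        SELECT d.drug_class, f.month, SUM(f.amount_paid)
        FROM fact_claim AS f
        JOIN dim_drug AS d ON d.drug_id = f.drug_id
        GROUP BY d.drug_class, f.month
        ORDER BY d.drug_class, f.month
    """).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    for row in spend_by_class_and_month(connection):
        print(row)
```

The point of the split is that transactional writes keep going to the normalized tables, while analysts query the fact and dimension tables, which are shaped for aggregation.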
...
So far, there have been few performance issues with the large amounts of data flowing through the system, which consists of an eight-way processor with a half-terabyte ultra fast SCSI back-end, according to Skeffington. "Up until a year ago, we had a smaller dual-processor server with a quarter-terabyte back end. That was doing okay, but we could see down the road that if we wanted to do heavier reporting and launch an online decision support system, we needed to ramp up."
...
At GHS, the benefits were most immediately apparent in terms of IT staff productivity. "Customers hardly talk to me anymore; they just go online," Skeffington said. That's fewer queries he has to make, fewer reports he has to write, and, therefore, less money he has to spend, he pointed out. Maine state officials are also able to analyze spending trends online, and thus make appropriate adjustments that could save taxpayer dollars.
...
The benefits gained from large database deployments can be felt across sponsoring organizations. In the Winter survey, 26 percent of respondents cited increased performance and the ability to meet workload, growth, and scalability demands. Another 14 percent cited the single view of enterprise data that such a database provides, while 13 percent credited their deployment with making IT a competitive and business asset.
...
Ace Hardware, a chain of independently owned hardware stores, has been tracking the metrics of its own large customer data warehouse implementation on an NCR Teradata system. The company reports that its analysis to create promotional campaigns "takes hours instead of days," and found that its positive response rate (decision to buy) doubled from 5 percent to 10 percent. The data warehouse also made the company's first-ever national sales campaign possible, said Diane Flynn, data warehouse manager for Ace. "Customer service and neighborhood convenience differentiate Ace independent retailers from our 'big-box' competitors," she explained. "And to continue to build on those factors, we knew we needed a data warehouse that would enable us to store and analyze transaction data for insight into customer preferences and behaviors."
...
While there are many technical challenges to address in deploying large databases, the most daunting obstacles are political ones, the Winter survey found. Almost a third of the group say organizational issues hampered the progress of their projects. "The most widespread obstacle was getting the buy-in from all levels of the organizations," said Kathy Auerbach, vice president of Winter Corporation. "You have to have the users buying in to saying, 'yes, these are the requirements we need, and when you give us this system, we're going to use it, and we're not going to hold back and do things the old way.'"
...
For instance, political issues played an inhibiting role at one Midwestern retailer that had grown its database to 30 TB, said Debbie Smith, database analyst with NCR Teradata and former database administrator at the site. While the system grew in leaps and bounds, it was sometimes a tough sell to management to justify ongoing improvements. "One year, we tripled our size of data, filled it up, and the next year we had to double it again," she related. "We were good at capacity planning, but we just didn't communicate it effectively." The key to such communication is documenting ROI as the database grows, Smith explained. "By not having ROI, it is difficult to present the business case to the business users. Technical details about costs of upgrades to support number of queries, it doesn't mean anything to them. All they should care about is if they're getting their information in a timely manner." This is new territory for many companies, she added. As databases continue to grow beyond their current boundaries, the largest deployers will continue to blaze new trails in terms of garnering organizational support, she said.


University of the District of Columbia Opts for OpenInsight
By Kara Kridler

It's called seeding the market. Software companies work hard to get their products into the hands of college students for one simple reason: The tools they learn in school today will be the tools used in the workplace tomorrow.
...Revelation Software has now joined the ranks of database vendors whose products are taught at the university. Next semester, computer programming students at the University of the District of Columbia (UDC) will learn the basics of database programming using the company's flagship OpenInsight applications development environment.
...
Carl Friedman, an associate professor in the Computer and Information Systems department at UDC, decided to use OpenInsight. He was first introduced to Revelation in the 1980s. At that time, he was doing a great deal of consulting for city government agencies in Washington D.C. His clients were looking for a database program to analyze data. Friedman attended a presentation where the company was offering its Advanced Revelation multivalue database, which, at the time, ran under DOS. He tried the free tutorial and liked what he saw.
...
"It was a fantastic package," Friedman said. "I toured through the whole tutorial in a weekend. It just beat everything on the market. There was nothing close to it. Even today, it cannot be touched. I was in love with it," he said.
...Friedman has stuck with the product through a series of corporate twists and turns over the past 15 years. Revelation Software is now headed by Mike Ruane, whom Friedman described as a "super programmer." The company unveiled its latest version, OpenInsight 4.1, this fall. OpenInsight is a complete application development environment with a multivalue database as its centerpiece.
...
With the new release, Revelation has stripped the database of much of its DOS heritage, and support for some of the features he uses has diminished, Friedman said. Nevertheless, he said, the overall package is a "developer's dream" and is still the best database program available. He has been working with Ruane and Revelation to develop strategies that will help OpenInsight find a wider audience. People who are unfamiliar with multivalue database concepts have to go through a learning curve, he noted.
...
As part of his commitment to broadening the user community for OpenInsight, Friedman has decided to offer a class to his students. The older version of Revelation could not reasonably be taught in a world dominated by Windows-based programs, he noted. The earliest Windows-compatible versions were also not appropriate. But with the latest version of OpenInsight, Friedman felt that it was time to take the plunge and introduce multivalue database programming to his students.
... Friedman will use the program in a class called Database Programming. The majority of Friedman's students are seniors, who have already studied several different programming languages. Students enrolled in the class must be familiar with the commonly used databases, such as Oracle, IBM DB2, MS SQL Server and MS Access. "I consider OpenInsight the best multivalue database on the market today and that is why I had no hesitation teaching it. The tools that come with it also work just handily on Oracle and Lotus Notes," Friedman said.
...
Even though Friedman is focused on the multivalue aspect of the program, he is enthusiastic about the other skills his students will obtain. "Everything they [his students] are learning, except the actual multivalue aspects themselves, can be transferred over to all of the other major database systems," he said. Friedman added that the most important aspect of using OpenInsight in his class is that when his students are finished with his class, they will have marketable skills regardless of what they encounter.
...
The intention is for OpenInsight to help his students develop tools that the other programs do not supply. "You can run the tutorial and after running the tutorial you can put together an application," Friedman said. OpenInsight should be a good match for UDC students since they tend to remain in the DC area after graduation and there is an existing demand in Washington for these programming skills. Friedman hopes his students will have a competitive edge for those jobs over employees who do not have OpenInsight experience. Without that experience, employees have to be sent away for training, which is time-consuming and costly to the employer.
...Friedman said that the most important factor in his decision to introduce OpenInsight in the classroom was that he strongly dislikes inefficiency. He wants to see more people doing things the fastest way. "If I do my job well, I expect my students to be in a position, within a shorter time than say other schools because I have older students, to have some influence in what software selections are made and OpenInsight may then become part of the decision mix or the solution mix in their organization," Friedman said.

Kara Kridler is a freelance writer living in Washington D.C.


Tellabs Copes with Database Proliferation
by Walt Jordan

Tellabs is a major provider of telecommunications infrastructure equipment. It manufactures cross connects and switches for a wide range of customers, including AT&T and the major Internet service providers.
...Over the past several years, the data management challenges at Tellabs, which has more than 5,000 employees, have increased dramatically. In fact, when Praveen Gautam joined Tellabs in March 2000, first as a database administrator, and then as the manager for global information services, the company had five databases. Gautam and three associates could manage those databases using in-house tools. Today, Gautam and his group, which consists of three teams, manage more than 100 databases. DBTA editor Walt Jordan talked with Gautam to understand how he does it.

Jordan: Who is in your group and to whom do you report?
Gautam: I have three teams: three SAP ERP administrators, four database administrators, and two batch-and-print administrators working with Tivoli. I report to the director of North American operations.

Jordan: What was the situation when you started at Tellabs?
Gautam: I was hired as a DBA. At the time, Tellabs was starting to work with Oracle. We had SAP running on Informix, which was our biggest database with six to seven terabytes of data. Our technical architecture group decided to standardize on Oracle for all our new implementations. And one year, the number of databases grew. Now there are close to 100 Oracle databases.

Jordan: What led to that proliferation?
Gautam: At the time, the economy was doing well, and Tellabs was growing at a rapid rate. We were implementing a portal using Broadvision. We were implementing Clarify and Documentum. All those applications were coming into production. And there were a lot of in-house initiatives going on. For those, Oracle was chosen to be the back-end database.

Jordan: Who managed the databases?
Gautam: There were two DBAs for the Oracle databases and two for the SAP implementation on Informix.

Jordan: Your responsibilities were?
Gautam: We had to support our production environment. We had to support our application teams for the new databases.

Jordan: How did you do that?
Gautam: When I joined Tellabs, there were no standards and no tools to manage database administration. I was doing things at the command line. But when the work began growing so rapidly, we didn't have the luxury of working at the command line and making mistakes and losing time.

Jordan: So what was your strategy?
Gautam: I was experienced with DBArtisan from Embarcadero Technologies. I liked it because it allows you to do a lot of the things that you do at the command line, but there is little chance of making mistakes. So I thought that would be a good approach.

Jordan: How did you make the decision to add the tool?
Gautam: After six months, I was promoted to be the team leader, and I was asked to cross train the others in the group to become Oracle DBAs, because the number of Oracle databases was growing.

Jordan: What was the training process?
Gautam: I sent them to formal Oracle training, but they were still worried that they didn't have enough experience to jump into a production environment. They knew what to do, but they didn't know exactly how to do it. So when they saw DBArtisan, it was a great help to them. DBArtisan gave them a good way to navigate through the print schemas and objects in Oracle. Before they would execute a statement, they could see the SQL running behind it. That was a big asset. They were able to get up to par quickly. Within two months, I could put them on the production support rotation.

Jordan: So you realized two benefits. Your team made fewer mistakes, and you were able to move people into a production environment more quickly.
Gautam: Right.

Jordan: Did you have to sell the idea of buying database administration tools internally?
Gautam: When I saw the number of databases growing rapidly and a lot of new applications on the horizon, I talked to my manager and told him how difficult it was for DBAs to work at the command line and the risk of mistakes and losing time and losing important data. I told him that we could not work that way in a production environment.

Jordan: Was cost an issue?
Gautam: The cost is peanuts compared to what we paid for other tools. I did a demo showing the benefits. Then I did a justification document showing the benefits.

Jordan: Have you looked at any other tools?
Gautam: We have BMC for our monitoring infrastructure, and we have installed some of those tools. And we have Oracle tools through our enterprise license. I use Oracle for the export and import of information. But I have found DBArtisan to be the most flexible tool.

Jordan: Is there anything else on your shopping list?
Gautam: At this point, I don't see the need for any other tools.

Jordan: When you first came in, the number of databases was exploding. Now we are in a period of consolidation. How have you managed that?
Gautam: Even though the scope of several projects was reduced, the databases still have to be managed. I haven't had to delete a single database or take any off support because an application has been canceled. Although the number of databases has stopped growing, I still have 100 databases to manage.

Jordan: So what is the day-to-day situation?
Gautam: A lot of development work is still going on. We get a lot of requests on a daily basis to make schema changes and to import and export refreshes. The 100 databases are on 50 servers, and it is impossible to log onto each box. DBArtisan has given us a central location to manage that.

Jordan: How do you manage the infrastructure?
Gautam: We have a database to manage the other databases. We have scripts that feed information into our central database, which is the repository of all our jobs information. We query that every day and have a GUI interface. To edit that, we use the edit function in DBArtisan. Before, I had to write SQL statements.
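The "database to manage the other databases" pattern can be sketched in a few lines of Python with the standard `sqlite3` module. This is only a minimal illustration under stated assumptions: the `job_status` table, its columns, the function names, and the sample entries are all hypothetical, and the article does not describe how the Tellabs repository or its feeder scripts are actually built.

```python
import sqlite3


def create_repository(conn):
    """Create the central table that per-database scripts report into."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS job_status (
            db_name  TEXT,
            server   TEXT,
            job_name TEXT,
            run_date TEXT,
            status   TEXT
        )
    """)


def record_job(conn, db_name, server, job_name, run_date, status):
    """Called by a (hypothetical) feeder script after each job finishes."""
    conn.execute(
        "INSERT INTO job_status VALUES (?, ?, ?, ?, ?)",
        (db_name, server, job_name, run_date, status),
    )


def failed_jobs(conn, run_date):
    """Daily check: list every job on the given date that did not report OK."""
    return conn.execute(
        """
        SELECT db_name, server, job_name
        FROM job_status
        WHERE run_date = ? AND status <> 'OK'
        ORDER BY db_name, job_name
        """,
        (run_date,),
    ).fetchall()


if __name__ == "__main__":
    repo = sqlite3.connect(":memory:")
    create_repository(repo)
    # Made-up sample entries for illustration only.
    record_job(repo, "crm01", "srv-a", "nightly_backup", "2002-12-01", "OK")
    record_job(repo, "docs02", "srv-b", "nightly_backup", "2002-12-01", "FAILED")
    record_job(repo, "portal03", "srv-b", "schema_refresh", "2002-12-01", "OK")
    print(failed_jobs(repo, "2002-12-01"))
```

Collecting status centrally means one query replaces logging on to each of the servers in turn, which is the benefit Gautam describes.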

Jordan: How many more databases could you manage before you would have to add to the head count?
Gautam: I think that we could grow about 20 percent before we have to add more people.

Jordan: Do you plan to add any new database platforms?
Gautam: We do have two Microsoft SQL Server databases and a Sybase database. We are developing an enterprise license with Microsoft, so I see the number of SQL Server databases picking up. But we can manage that with DBArtisan as well.

Jordan: What kind of challenge does adding different database platforms present?
Gautam: We will have to cross-train people about how SQL Server does things differently than Oracle. But the basic database administration concepts are the same. I think I will be able to do it with internal cross training.

Walt Jordan is a regular contributor to DBTA; write him at Walt@dbta.com.


DB2 Version 8.1 Goes to General Release
by Billy Rosario

IBM announced that DB2 Version 8.1 went into general release in late November. The release represents an important milestone in the technology roadmap IBM has laid out for its data management infrastructure solutions and autonomic computing initiatives. The new database software is designed to help companies simplify and automate many of the tasks associated with maintaining databases, and to deliver the broadest support for open standards, enabling customers to manage, integrate, and analyze information from the widest variety of sources to gain a greater return on their investment.
...
To understand what the general release of DB2 Version 8.1 means to IBM and to understand the company's vision for the future, DBTA contributing editor Billy Rosario talked to Dr. Patricia Selinger, IBM Fellow and vice president of data management architecture and technology. A pioneer in the field of relational databases, Selinger directed IBM's Database Technology Institute for 12 years. In 1999, she was elected to the National Academy of Engineering for contributions to the field.

DBTA: What are you most excited about in this DB2 release?
Selinger: There are a number of things that are important to our customers. We have a number of technology pieces in Version 8.1 that really help cut down the amount of time a database administrator has to spend maintaining and tuning the database.

DBTA: Why is that so important?
Selinger: As databases grow in size and the need to connect data together grows to keep companies competitive, a DBA's job keeps growing. Anything we can do to help with the total cost of ownership by reducing the amount of time that a DBA takes to do certain tasks or eliminate them altogether, that is a wonderful thing. Look at something like the Configuration Advisor in Version 8.1. This is a set of questions that a DBA spends about 20 minutes answering. We then come out with recommendations based on expertise built in by experts at our development labs and our performance team for the configuration parameters. We have tested this against the tuning we did for an OLTP benchmark and we came within 91 percent of the experts. For 20 minutes of work compared to weeks or months of work by the experts, this is a substantial savings.

DBTA: But still, is it good enough?
Selinger: For many of our customers, it is very close for the kind of workloads that they run. For the customers who are on the leading edge and want that last ounce of performance, this is a very good starting point. They can do hand tuning from this point on. Some of our real customers got better performance than from the experts.

DBTA: There is some other new technology aimed at BI applications too.
Selinger: We have added multidimensional clustering at the physical storage level, which nobody else has. We have applied for a patent for this technology. It concerns the ability to store data and cluster it in many dimensions at the same time. For any user request, you can go straight to the data that qualifies for your query. It is an advantage for people who build warehouses. With most warehouses, you are not absolutely certain how people are going to query it. You get surprises. But rather than having to reorganize your data, you can organize it across multiple dimensions.
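The idea of clustering data along several dimensions at once can be illustrated with a small conceptual sketch in Python. This is not DB2's implementation, which works at the physical storage level; the `ClusteredTable` class, its method names, and the sample rows are hypothetical and exist only to show how rows grouped into cells by dimension values can serve queries on any of those dimensions without reorganizing the data.

```python
from collections import defaultdict


class ClusteredTable:
    """Toy model: rows are grouped into cells keyed by their dimension values."""

    def __init__(self, dimensions):
        self.dimensions = tuple(dimensions)
        self.cells = defaultdict(list)  # (dim values...) -> rows in that cell

    def insert(self, row):
        """Place the row in the cell matching its value on every dimension."""
        key = tuple(row[d] for d in self.dimensions)
        self.cells[key].append(row)

    def select(self, **filters):
        """Return rows from cells that match filters on any subset of dimensions."""
        unknown = set(filters) - set(self.dimensions)
        if unknown:
            raise KeyError(f"not a clustering dimension: {sorted(unknown)}")
        result = []
        # Only cell keys are inspected; non-matching cells are skipped whole.
        for key, rows in self.cells.items():
            cell = dict(zip(self.dimensions, key))
            if all(cell[d] == value for d, value in filters.items()):
                result.extend(rows)
        return result


if __name__ == "__main__":
    sales = ClusteredTable(["region", "month"])
    # Made-up sample rows for illustration only.
    sales.insert({"region": "east", "month": "2002-10", "amount": 120})
    sales.insert({"region": "east", "month": "2002-11", "amount": 80})
    sales.insert({"region": "west", "month": "2002-10", "amount": 200})
    print(sales.select(region="east"))
    print(sales.select(month="2002-10"))
    print(sales.select(region="west", month="2002-10"))
```

Because the same cell layout answers a query by region, by month, or by both, an unexpected query pattern does not force a reorganization, which is the warehouse advantage Selinger points to.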

© 2002 Unisphere Media L.L.C.