Saturday, December 5, 2009

TSM with Content Manager

Sorry for the complete lack of posts here lately. My business, Transcendent Technology, has been doing fantastically. That part is great, as it means a good income for me. The downside is, of course, that I have been working some really long weeks: many 14-hour days and 7-day weeks. That does not leave a lot of time to post here. I thought I would take a shot at it anyway and get some new info up. No one is asking questions, so I will just talk a little about my basic background in TSM and then talk about some issues with upgrading to new versions.

My main usage of TSM is in conjunction with Content Manager and Content Manager OnDemand (see my OnDemand blog at http://contentmanagerondemand.blogspot.com/). IBM has leveraged the TSM product to provide a storage repository for these products. I will use an old client as an example (no names, to protect me!). I installed Content Manager for this client several years ago. I believe it was in the early days of version 8 (a horror story of its own). The client insisted they did not want TSM installed with the product. Against our better judgment, we finally agreed. About a year later, the client had almost a TB of 40 KB files stored on their hard drive. The hard drive got corrupted and the corruption was written across the RAID. The entire TB was ruined. They started restoring this data from file system backups, and it took them 30 days to finish. I found out all this while I was up there installing a new version of Content Manager, this time with TSM support. I was also running queries to find the current documents so they could be moved from disk to TSM.

So why is TSM such a plus in this situation? The way that IBM has leveraged TSM is to use the APIs to back up the documents from the disk drive to storage pools in TSM. The storage pools can be on any supported hardware, although most of them are on DASD. Once a document has moved to TSM, it is deleted from disk storage. In fact, Content Manager can be configured so that the document goes straight to TSM and is never stored on disk. If the storage pool volumes are on DASD, this is the approach I recommend when I install Content Manager.

But the real advantage of TSM in conjunction with Content Manager or OnDemand is that the multitude of small files that make up the usual report data in these two products gets stored in large files in TSM. So, with the client I mentioned earlier, as I moved their 40 KB documents into TSM, I configured 20 GB storage pool volumes. One TB would be 50 of these volumes. Now the file system backups run quicker, having only to back up 50 files instead of millions, and a restore would take much less time. So why does it take less time to restore 50 20 GB files than the same amount of disk usage in small files? Because of opening and closing all the small files. There is system overhead for opening and closing each file as it is restored. Opening a single 20 GB file and streaming the data back into it takes a fraction of the time that restoring 20 GB of 40 KB files would. The same is true when doing system backups. The larger files back up much quicker than a bunch of small ones.
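
To put rough numbers on that, here is a small Python sketch of the arithmetic. It assumes decimal units (1 TB = 1000 GB) and that every document is exactly 40 KB, which is a simplification. The 5 ms per-file open/close cost is a made-up figure purely for illustration, not something I measured, and the function names are my own.

```python
# Back-of-the-envelope comparison: 1 TB stored as 40 KB documents
# versus the same 1 TB stored in 20 GB storage pool volumes.
# Decimal units are assumed throughout (1 TB = 1000 GB).
KB = 10**3
GB = 10**9
TB = 10**12


def count_objects(total_bytes, object_bytes):
    """Number of files needed to hold total_bytes (rounded up)."""
    return (total_bytes + object_bytes - 1) // object_bytes


def overhead_hours(file_count, per_file_seconds):
    """Total open/close overhead in hours for a given per-file cost."""
    return file_count * per_file_seconds / 3600


small_files = count_objects(1 * TB, 40 * KB)  # 25,000,000 documents
volumes = count_objects(1 * TB, 20 * GB)      # 50 volumes

# Hypothetical per-file open/close cost, chosen only for illustration.
PER_FILE_SECONDS = 0.005

print(f"small files: {small_files:,}")
print(f"volumes:     {volumes}")
print(f"overhead, small files: {overhead_hours(small_files, PER_FILE_SECONDS):.1f} hours")
print(f"overhead, volumes:     {overhead_hours(volumes, PER_FILE_SECONDS):.6f} hours")
```

The amount of data streamed is the same in both cases. Only the per-file overhead changes, and it scales with the file count, so the real gap depends on what that per-file cost actually is on your hardware and backup software.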

So, bottom line: if you are using either Content Manager or Content Manager OnDemand, installing TSM to store the documents and reports is a no-brainer for keeping the file system in a manageable state. If you are not storing enough documents in either product to really need TSM, then you should not have these products. Go for a lower-end product.

The next post (whenever it may come) will cover the upgrade issues I mentioned at the start of this one.