CRM backups or audit trails? Yes, please

You'd think that either thorough backups of your CRM system or full audit trails would be enough to keep you out of trouble.

As with nearly any transactional system, getting usable CRM data backups is tricky because the data is always changing and depends on coherency across several tables. Ideally, you'd fully quiesce the system and do a full backup every day, or enable an online backup. But with modern cloud systems and 7x24 customer access (via portals or mobile apps), you can't take the system down, and cloud vendors like SFDC don't provide a full backup more often than once a week. The situation with audit trails is different: they may reliably capture all the changes, but you may only be allowed to track a limited number of fields (the default in SFDC is 20 per object).

Let's take a top-down view of the problem space and its use cases to explore solution strategies for this kind of backup and archival data:

Disaster recovery/business continuity. Most enterprise SaaS vendors provide this as part of the service. They have multiple operations sites and provide a system image with very solid availability numbers. They have lots of infrastructure behind the scenes to support this, but it is specifically designed for their own DR/BC needs, and customers can't access it for their own data-loss issues. Even if you had all the metadata, data and system configuration information on your own disks, you wouldn't have access to the cloud software, so it wouldn't do you much good over the short run. For this use case, I recommend that you don't try to do it yourself; instead, buy more assurances from your cloud vendor.

File insurance/ability to migrate/checkpoint full-system image. The serious cloud systems provide either data export tools or fast APIs to pull data. You might have to upgrade your version or pay for extra API calls, but these methods really work. There are a dozen different tools and strategies available at moderate cost for SFDC. The classic approach is to run the tools in the middle of the night daily or on Saturday; the data may pick up a few updates during your backup cycle, but the metadata almost surely won't. That said, in SFDC the API and tools for pulling data are different from the ones you use for metadata, and there are a couple of key tables that aren't available at all via the API. If you're completely fixated on getting a faithful reproduction of the entire system state, you'll need to buy a Full Sandbox and refresh it as often as it will let you.
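
To make the "run tools in the middle of the night" idea concrete, here is a minimal sketch of a nightly data pull written in Python against the third-party simple-salesforce library. The object list, queries, credential handling and file names are illustrative assumptions, not a recommendation of specific objects; a real backup job would also pull metadata, page through large objects with the Bulk API and add retry logic.

    # Minimal nightly data-pull sketch using the simple-salesforce library.
    # The objects, queries, credential handling and output paths below are
    # illustrative assumptions; a production job also exports metadata,
    # uses the Bulk API for large tables and adds retries/alerting.
    import csv
    import os
    from datetime import date

    from simple_salesforce import Salesforce

    sf = Salesforce(
        username=os.environ["SF_USERNAME"],
        password=os.environ["SF_PASSWORD"],
        security_token=os.environ["SF_TOKEN"],
    )

    OBJECTS = {  # hypothetical subset; back up every object you actually care about
        "Account": "SELECT Id, Name, LastModifiedDate FROM Account",
        "Opportunity": "SELECT Id, Name, StageName, Amount, LastModifiedDate FROM Opportunity",
    }

    stamp = date.today().isoformat()
    for obj, soql in OBJECTS.items():
        records = sf.query_all(soql)["records"]
        if not records:
            continue
        fields = [k for k in records[0] if k != "attributes"]  # drop the REST envelope
        with open(f"backup_{obj}_{stamp}.csv", "w", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=fields)
            writer.writeheader()
            for rec in records:
                writer.writerow({k: rec.get(k) for k in fields})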

Unfortunately, that's once a month, so if you want weekly, absolutely complete backups, you'll need to buy four of them. A refresh of a big system may take a day or more, so make sure to start the process as soon as things slow down on Friday night. Even with that, there are a couple of hidden tables that aren't faithfully reproduced, and you'll have to do some additional manual steps. There's a cost for this perfectionism: be prepared for sticker shock.

Archive copies for audits, compliance and legal discovery. I am an expert witness, and I can assure you that almost nobody keeps all the data they should to help them prosecute or defend a legal matter three to six years in the future. And yes, that's typically when it will happen. You want to make sure you have the data to prove your point in wrongful-termination, IP-theft, abusive-sales-tactics, stockholder and other suits that may come your way.

The first level of defense is a complete image of data and metadata on write-once media (which ensures nobody can tamper with or discredit your data). A weekly data and metadata export saved on a DVD or two will do the trick, and don't throw the discs away until they're seven years old. There are also automated third-party cloud backup solutions to address this requirement. But then there are those two SFDC mystery tables I mentioned above that aren't available via the API. These you can dump manually: the login history and setup audit trail can be exported to a CSV file, and they must be exported to write-once media at least every 180 days (as data older than that falls over the storage horizon). So what do you do if you've got a legal action and you didn't store the archive images? Well, in the land of SFDC they may be able to recover it from their internal data streams (used to populate their DR sites, etc.). But getting that recovery done is a consulting project that may end up costing thousands of dollars per relevant object recovered. And that's if it can be recovered. No, I am not kidding. So ... go back to the top of this section and do the right thing from this moment on.
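
If you go the do-it-yourself route, one cheap way to strengthen the "nobody tampered with it" argument is to generate a cryptographic digest of every exported file before it goes onto the write-once media, and archive the digest manifest with the discs. This is a generic Python sketch, not an SFDC feature; the file names and manifest format are illustrative assumptions.

    # Sketch: write a SHA-256 manifest for exported CSVs before burning them to
    # write-once media, so the archive can later be shown to be unaltered.
    # File names and the manifest format are illustrative assumptions.
    import hashlib
    import sys
    from pathlib import Path

    def sha256_of(path: Path) -> str:
        digest = hashlib.sha256()
        with path.open("rb") as fh:
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                digest.update(chunk)
        return digest.hexdigest()

    if __name__ == "__main__":
        # Example: python manifest.py exports/*.csv > MANIFEST.sha256
        for name in sys.argv[1:]:
            path = Path(name)
            print(f"{sha256_of(path)}  {path.name}")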

Audit trails to reconstruct the evolution of records. All the discussion so far has been about static snapshots in time, and those techniques provide little in the way of a timeline or sequence of changes. The "whodunit" analysis requires the field name, before value, after value, changer ID and time of change. Salesforce does provide field-level audit trails that fill the bill nicely, but by default you get only 20 of them per object and they're not guaranteed to stay around forever...so make sure you back up those history tables along with the rest of the data. But even if you have all the limits removed or write your own audit-trail enhancements, audit trails are only really practical for analytics. I know of no products that can use audit trails to "play the system state forward" from a backup image to a specified point in time. I'm sure you could write something for that (and maybe your data warehouse team would love the project), but it's a significant R&D effort. The situation is far bleaker for metadata: the changes aren't recorded with enough detail to "play the system configuration forward" from a snapshot.
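
For the curious, the data half of such a "play forward" tool is conceptually simple (metadata is the hard part). Here is a minimal Python sketch that replays exported field-history rows onto a backup snapshot up to a chosen point in time; the CSV layouts and column names are illustrative assumptions, and a real tool would also have to handle record creation and deletion, ownership changes and relationships.

    # Sketch: "play the data forward" from a backup snapshot by replaying
    # field-history rows (RecordId, Field, NewValue, CreatedDate) up to a
    # cutoff. CSV layouts and column names are illustrative assumptions.
    # Timestamps are assumed to be UTC ISO-8601 strings, so they sort lexically.
    import csv

    def load_snapshot(path):
        """Backup CSV keyed by record Id -> dict of field values."""
        with open(path, newline="") as fh:
            return {row["Id"]: dict(row) for row in csv.DictReader(fh)}

    def play_forward(snapshot, history_path, as_of):
        """Apply history rows in time order, stopping at the as_of timestamp."""
        with open(history_path, newline="") as fh:
            rows = sorted(csv.DictReader(fh), key=lambda r: r["CreatedDate"])
        for row in rows:
            if row["CreatedDate"] > as_of:
                break
            record = snapshot.get(row["RecordId"])
            if record is not None and row["Field"] in record:
                record[row["Field"]] = row["NewValue"]
        return snapshot

    if __name__ == "__main__":
        state = play_forward(
            load_snapshot("backup_Account_2014-03-01.csv"),  # hypothetical files
            "AccountHistory_export.csv",
            as_of="2014-03-04T17:00:00.000+0000",
        )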

Audit trails to roll back a data corruption. Again, SFDC can be configured to record all the data elements you need for this, but none of it is turned on by default, and you may need to write some code or use a third-party solution to record absolutely everything. But as stated above, audit trails and logs by themselves do not a roll-back make, particularly for something as messy as an erroneous merge of accounts or a mass deletion with cascading effects. I know of no products or general procedures that work in all situations; you'll need to do some analysis to determine the right approach. In some cases, the best approach is to use the most recent backup to return all the affected records to "last Saturday night" and then apply all the relevant updates since then. In others, you do "subtractive work" on the currently corrupted data. In either case, un-scrambling corrupted records requires relentless attention to detail, and it can be a very risky procedure if you try it on your own without the right backup and recovery tools.
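
As a rough illustration of the "back to last Saturday night" approach, here is a sketch that pushes backup values back into SFDC for a known list of damaged records, again using the simple-salesforce Python library. The record Ids, field list, file name and credentials are illustrative assumptions; rehearse anything like this in a sandbox before touching production data.

    # Sketch: restore a known set of corrupted Account records to the values
    # in last Saturday's backup CSV. Ids, fields, file names and credentials
    # are illustrative assumptions; test in a Full Sandbox first.
    import csv
    import os

    from simple_salesforce import Salesforce

    sf = Salesforce(
        username=os.environ["SF_USERNAME"],
        password=os.environ["SF_PASSWORD"],
        security_token=os.environ["SF_TOKEN"],
    )

    DAMAGED_IDS = {"001000000000001AAA", "001000000000002AAA"}  # hypothetical Ids
    RESTORE_FIELDS = ["Name", "BillingCity", "OwnerId"]         # fields to roll back

    with open("backup_Account_2014-03-01.csv", newline="") as fh:
        for row in csv.DictReader(fh):
            if row["Id"] in DAMAGED_IDS:
                # Only the chosen fields are pushed back; everything else keeps
                # its current (post-corruption) value, so review the list carefully.
                sf.Account.update(row["Id"], {f: row[f] for f in RESTORE_FIELDS})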

Bottom line

Cloud systems are inherently loosely coupled, and many of the vendors are still maturing their functionality in the backup, archive and audit-trail areas. Even with third-party products, you cannot take it for granted that you'll be able to get everything you need. The first order of business is to develop an archive and audit-trail plan, with a process that is quite detailed and relentlessly followed, because no product can overcome sloppy execution by the sysadmin staff.