Hello community,
I would be very interested in hearing about your experience with splitting a huge Jira DC instance. In our case, we are studying whether and how to split a huge Jira DC instance into 2 instances (in total, thousands of projects and millions of issues).
Our plan is to duplicate the initial instance. In the newly created instance we will have to keep several hundred projects, but also remove thousands of projects, including their issues and attachments.
Using APIs to remove projects seems to lead to several weeks of processing.
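For a rough sense of scale, here is a minimal Python sketch. It assumes the standard Jira Server/DC REST endpoint DELETE /rest/api/2/project/{projectIdOrKey} and a personal access token sent as a Bearer header. The base URL, project key, token, and the per-project timing are hypothetical placeholders, not measurements. The request is only built here, never sent.

```python
import urllib.request


def build_delete_request(base_url: str, project_key: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a project DELETE request.

    Assumption: Bearer-token auth is enabled on the instance.
    """
    url = f"{base_url.rstrip('/')}/rest/api/2/project/{project_key}"
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": f"Bearer {token}"},
    )


def estimate_days(project_count: int, avg_seconds_per_project: float, workers: int = 1) -> float:
    """Naive estimate: assumes deletions scale linearly and workers do not contend."""
    return project_count * avg_seconds_per_project / workers / 86400


if __name__ == "__main__":
    # Hypothetical figures: 3000 projects, 10 minutes each, one worker.
    print(f"{estimate_days(3000, 600):.1f} days")
```

With those made-up figures a single worker needs about three weeks. Measuring the real per-project time on a test copy would give a more trustworthy number.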
As I said above, we would be glad to hear from anyone who has been in the same situation, and to get advice on any existing Marketplace app (if there is one) that could help.
Thanks
We did that by generally restoring a separate instance from SQL, and then we archived projects from both instances so as to "split them".
There's of course a ton of other stuff that has to be planned ahead. For example: user licenses, which is probably a major point to consider (no need to license users on both instances if they only need one); permission schemes for archived projects (prevent browsing by end users to avoid confusion); load balancer redirection to bounce users to the right instance for some time after the split; and any other 3rd party stuff such as eazyBI DB/JVM, ScriptRunner script roots, maintenance scripts, monitoring, alerts, integrations, everything.
I can't really explain each individual step because that would be too specific, and I'm heavy on API and automations, all of which are under NDA in their final form and can't be shared. The most I can say is that you can't really use any apps to migrate stuff. In my opinion an SQL restore is de facto the only way to do it: otherwise you'd be migrating for several years going project by project, and SQL ensures no internal IDs are changed, so it's less "breaking".
Rollback would be possible in theory because the original instance has all the data until you either delete or modify it. It's also much more reasonable to migrate projects individually later down the line if needed, rather than trying it for hundreds/thousands at which point it would thrash any underlying data anyway due to import mapping, changing IDs and everything.
What you need is people who understand your systems in great detail to plan things through, and techies that can prepare and test all the scripts as needed. Not simple, but doable.
Hello Radek,
First, thanks for your answer.
Your feedback confirms our approach of duplicating the database, and the complexity of the data model.
To be precise, my concern is more about the time needed to remove the thousands of projects that we cannot keep (even archived) in both instances. Users of one instance will not be allowed to access the data of the other, as it will be the property of another company.
Thanks !
SQL sounds like your only option, if it's prepped really really well - that'd probably be the fastest. Then again, a lot of 3rd party apps might not like it if not accounted for. Nobody likes nullpointers popping up in random places.
Pure Jira data isn't that complicated to wipe out, with some additional data mining to collect the attachment directories to wipe as well. There could still be other AO_ tables with value references practically anywhere (and a project delete might not even remove them to begin with).
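A minimal sketch of that attachment "data mining", assuming the default file-system layout where each project has its own directory under <jira-home>/data/attachments named after the project key (as far as I know that is the original key if the project was ever renamed, so check against the database first). The function names and paths are hypothetical, and nothing is deleted: it only lists candidate directories and their sizes.

```python
from pathlib import Path


def attachment_dirs_for_projects(attachments_root, project_keys):
    """Return the existing per-project attachment directories for the given keys."""
    root = Path(attachments_root)
    found = []
    for key in sorted(set(project_keys)):
        candidate = root / key
        if candidate.is_dir():  # skip projects that never had attachments
            found.append(candidate)
    return found


def total_size_bytes(directory):
    """Sum the size of every regular file below the directory."""
    return sum(p.stat().st_size for p in Path(directory).rglob("*") if p.is_file())


if __name__ == "__main__":
    # Hypothetical path and keys; replace with your own before running.
    for d in attachment_dirs_for_projects("/var/atlassian/jira/data/attachments", ["AAA", "BBB"]):
        print(d, total_size_bytes(d))
```

Reviewing the printed list before removing anything keeps this step reversible until the very end.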
I don't like the SQL approach, it's a risky one way street. I see a lot of problems with the "cut both parties off" way - first, it means you would need to delete ALL of your previous backups, otherwise you wouldn't be compliant. Meaning that once you decide to go with the split, you basically wipe your backups as well as nuke your DB and hope that everything works. So you see that practically, this is a problem. Just removing the data from Jira is not the only step, if you don't delete everything then Jira doesn't matter, could always restore from backups.
Once you split that data off to another company, they would need to do the same thing with different projects there, right? Otherwise you only copy over an already "partially" deleted SQL backup. If you don't have the original "full" backup, you see where I'm going with this.
There's got to be a better way to handle it. Currently whichever company (assuming you) owning the instance has all the data, attachments, backups. I'd like to think that you should be able to hold onto it for the necessary amount of time until you are able to safely remove that data without a risk to your operations - because blindly nuking everything and hoping it sticks isn't really a business viable strategy for either of those parties.
Overall I'm coming to the conclusion that if you cannot hand over the original backup, then you'd have to either sanitize or delete the data from a SQL backup, and provide only selective attachment directories - thus keeping the original, handing over data for the other party.
What I'm hearing are three problems:
- you can't really delete projects the "standard" way, because it takes too much time. To solve this you need a longer period in which to keep deleting them until they're gone. That's a technical problem; maybe Atlassian can provide a faster way to delete projects than the standard web UI offers
- if the whole backup gets in hands of a different company, none of that, strictly speaking, matters, because as soon as the data touches their ground, who's to say they don't copy it elsewhere or that they do their cleanups?
- if you can only provide a cleaned up "split" backup and attachments (which strictly speaking sounds like the only allowed decision), then you have to keep the original data for at least some time, otherwise if you mess up at any point you're going to break both systems with no backups and nobody wants that
"Delete from both ends" doesn't sound like a logical solution to this and it puts a lot of risk on both companies. Who wants to argue about whose fault it was if things go south, and what the damages would be? One of you will need to keep the data for at least some time before it can be properly and safely removed.
I assume SQL is the only way in either case. Maybe sanitization could play a part elsewhere (e.g. users). So yeah, it sounds like a ton of SQL prep to delete or sanitize any critical data, whichever way the decision goes. I can't say I know which specific tables (or 3rd party AO_ tables) that covers, but either someone on your team is familiar with it (with a huge instance you practically live in SQL, so somebody's got to know their way around), or Atlassian could likely provide the queries, seeing as manual deletion doesn't really work.
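As a starting point for that SQL prep, here is a small sketch that sizes the job by counting issues per project to be removed. It assumes the core Jira tables project (with pkey) and jiraissue (with a project column pointing at project.id), which is how I understand the schema. An in-memory SQLite database with made-up rows stands in for the real one, and it only reads data. It does not touch AO_ tables or anything app-specific.

```python
import sqlite3


def issue_counts(conn, project_keys):
    """Return {project key: issue count} for the given keys (read-only)."""
    keys = list(project_keys)
    if not keys:
        return {}
    # "?" is the SQLite placeholder style; other drivers may use "%s".
    placeholders = ",".join("?" for _ in keys)
    sql = (
        "SELECT p.pkey, COUNT(i.id) FROM project p "
        "LEFT JOIN jiraissue i ON i.project = p.id "
        f"WHERE p.pkey IN ({placeholders}) GROUP BY p.pkey"
    )
    return dict(conn.execute(sql, keys).fetchall())


def demo_connection():
    """Build a tiny stand-in database with hypothetical projects and issues."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE project (id INTEGER PRIMARY KEY, pkey TEXT);
        CREATE TABLE jiraissue (id INTEGER PRIMARY KEY, project INTEGER);
        INSERT INTO project VALUES (1, 'AAA'), (2, 'BBB'), (3, 'KEEP');
        INSERT INTO jiraissue VALUES (10, 1), (11, 1), (12, 3);
        """
    )
    return conn


if __name__ == "__main__":
    print(issue_counts(demo_connection(), ["AAA", "BBB"]))
```

Running the same kind of count against a restored copy, before and after any cleanup script, is a cheap way to check that only the intended projects were touched.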
Sorry, most I can think of right now.