Merge Joomla sites

Merging two joomla sites into one

This comes up as a job every now and again and its a bit nerve wracking so here's my quick hackers guide to doing it. Don't expect to follow this verbatim but it should help you get the job done. You will need to be working at the linux command line and familiar with mysql client. Let me know if there is anything that needs clarifying or explaining further.

If you need my help, don't hesitate to contact me!

The editor has garbled the command line statements below - I've done a quick fix but there may still be errors. Let me know if so and I'll fix them up.

If you are doing this as a client service, take care with making any promises until you've done a dry run. If it is straightforward it will all be done in an hour. If it proves difficult, you could end up spending days on it. The key things that move it from 'straightforward' to 'difficult' are conflicts in extensions and hacks to Joomla itself. These are most likely to involve login and authentication issues or cached references to article ids.

Before you begin

Before you start you need to have both sites up and running on a local server. You should also ensure that at least one of the sites is frozen while you experiment so that there are no changes being made to the online site. We'll take the other 'live' site down for a minimal amount of time (hopefully less than an hour) but will probably need to suppress article editing/addition for longer.

Remove anything that you can from the sites. This might seem like a strange remark but most functioning sites have a number of redundant templates and possibly extensions that are not used or are no longer required. Cleaning this stuff up before the merge will help to reduce the likelihood of problems.

If one of the sites is broken, there is a possibility that the merged site will also be broken. There are too many ways that this can happen to consider here so you'll need to spend time investigating and that time would probably have been better spent getting the site working (possibly by removing large chunks of it that may perhaps be provided by the other site).

If one of the sites is in an earlier version of Joomla then update it to the later version. Make sure both the local versions of the sites are at the same Joomla version and at the same plug in or add on versions where they share any plug ins or versions.

Its best to consider one of the sites as the 'target' (pick the one with the most complicated menus and the largest number of articles) and to copy the other 'source' db into it.

There are two major things that we have to merge. These are the files making up the sites and the databases that serve the sites. The files are generally fairly easy because Joomla is (mostly) a data driven system. The databases can be a bit more tricky.

A major area of concern is likely to be the login facilities. Unfortunately many joomla extensions and tools use incompatible login processes and this can require quite a bit of work to sort out. If in doubt try it out but if the login causes problems you'll need to rewind and see if you can remove the conflicting systems from one or both of the sites before reattempting the merge process.

We also have the articles to worry about. Before you merge, review the sections and categories and ensure that if they are named the same in the target and the source you really want them to be the same (if not rename one of them). Clean out all the trash.

Delete any modules, plug ins etc. from both databases. Anything you don't need. The less fluff left around to distract us, the better.

Merging the file systems

If you are using source code control things are likely to be easier. I'm still using subversion but I have a suspicion that new users will be better off with bazaar. If you are not using source code control then you probably should be. Even a one man show benefits from the ability to compare file versions and keep track of all of the changes made to the files over time. There is an excellent deployment and repository service available too at Springloops - it saves me time and money.

Probably the quickest way to merge the files in most cases is, once the versions are all matched as described, to copy the source into the target skipping any files that exist already in the target. When you try this make sure you can easily rewind either with source control or with a backup of the directories involved. If your target still works after this then you are looking good. If it doesn't then its most likely you need to back up and get the versions matching for everything.

If you are using ubuntu linux you probably already have rsync available to you at the command line. This is a very powerful and very fast copy utility that can do just what we need. Spend some time reading its man pages and perhaps some of the online tutorial stuff because its a utility you will almost certainly need for other jobs:

rsync -a --ignore-existing --progress   \
Example: ~$ rsync -a --ignore-existing --progress \ ~/websites/source/ ~/websites/target

At this stage your target will have inherited some of the features of the source. You can switch to the template you want to use if it existed in the source and not the target but you wont see the source modules or menus. The key thing at this point is to make sure nothing seems to be broken. If you are using source control, you can review the additions that were made in this process. Some files, particularly css and the like in the templates may require merging manually.

If you are using source code control (which you should be) you now need to add the files to your repository so that they will be properly deployed. For subversion you can do this as follows:

# Add all new files to svn.
# The first grep assumes you don't want to control hidden files
# in the root (commonly present if you
# are using Zend studio, Dreamweaver etc. assuming you haven't told
# svn to ignore them already)!
svn status | grep -v "^.[ \t]*\..*" | grep "^?" | \
awk '{print $2}' | xargs svn add
svn commit

Merging the databases

This is the fun bit because we have to carefully merge the tables that contain the user and article information. To make our life easy when rebuilding we'll also merge tables that contain menus and module information. The end result will look pretty awful and will have conflicts where modules are fighting for position but it should be something that we can quickly tidy up in the administration back end of the site. Everything should be functioning fine when we complete the merge.

To merge the databases we're going to repeat a broadly similar task for each table in the source so that it can be added into the target. mysqldump is a great utility for exporting information and we'll take advantage of it but unfortunately we can't tell it to only export certain columns and in order for our export to import appropriately into our target we need to make sure that it does not carry any 'id' columns with it. The id columns will almost certainly conflict with the range of ids already present in the tables in our target.

First we can simplify our job by directly copying any source tables that are absent from the target. Notice I'm now talking about the source and target databases, we're done with the files.

Lets use the command line to create a list of the tables that are in source but not in target and a list of tables that are in source and in target:

mysql  -e 'show full tables where Table_type = "BASE TABLE"' |
  awk '{ print $1}' |
  grep -v '^Tables' > target_tables;
mysql -e 'show full tables where Table_type = "BASE TABLE"' |
  awk '{ print $1}' |
  grep -v '^Tables' > source_tables;
cat source_tables | grep -vFf target_tables  > tables_only_in_source;

Now we can dump the 'tables_only_in_source' directly into a sql script and then import them into target.

cat tables_only_in_source |
xargs mysqldump  > tables_only_in_source.sql;
mysql  -e "source tables_only_in_source.sql;"

All done for those! They were the easy ones.

Now we have to deal with the tables that exist in both systems. If Joomla used foreign keys in mysql to maintain integrity this would be easy - we could just modify the ids in one database to all be outside the range of ids in the other database and then append the records into the relevant tables. Sadly, Joomla doesn't do this for the very good reason that it complicates the deployment of the joomla system.The problem we face is that category IDs, article IDs etc. are used in various tables and if we change the ID we need to ensure that the ID changes everywhere - otherwise we'll have menu items referencing the wrong article or having the wrong author etc.

This problem is compounded by the embedded use of IDs in things like the menu links. As far as code is concerned, this link field is just text. Embedded in it we have article ids, category ids etc.

My goal here is to keep things as simple as possible. At this stage we'll bring all the required tables into a single database to make things easier to work with. In my case, as the bulk of the articles are coming from the source I've chosen to make the source jos tables replace the target jos tables. You could equally well do this the other way around.

A cute trick I'm going to use here is to use the command line to write me a bash script which will create a copy of each table I need copied in the target db. I need a command like "mysql -e "select * into old_ from " to execute for each of the tables in my "tables_in_target_and_source" file. Now I could dive into bash and write a loop to iterate through each line of the file and execute the relevant bash command - its a perfectly reasonable technique and I'd use it happily for a more complex problem but for a quick solution, here's the sed way to build the script we need. This approach has the advantage of creating a robust simple script that is easy to visually confirm as correct. If it looks like hieroglyphics, google or use "man sed" to see what its doing.

cat tables_in_target_and_source | 
sed 's/\(.*\)/mysql  \ -e "create table old_\1 select * from \1;"/g' \
  > make_copies;
./make_copies;

Now we can copy these tables over into the target without losing any of the old data we'll need to integrate the existing target articles.

cat tables_in_target_and_source | xargs mysqldump  \
> tables_in_target_and_source.sql
mysql sprdevc_development -e "source overwrite_tables.sql;"

By the way, whenever I create a file in these workings, I usually use vim to check it before proceeding. As everything is text and readable, this is a valuable check in ensuring that everything works first time.

Now for the biggy - lets take a look at our master local website! It should look like the source website. Behind the scenes its got all the data from both sites but as all the jos tables are now those from the source.

Lets try the users next - notice that the logins from users on the source should work fine but logins from the users of the target system will fail.

We'll go to the mysql client interactively to sort the users out. The complexity of this will depend entirely upon which applications you have on the systems and whether those systems make use of the user ids or not. With the lack of referential integrity inherent in Joomla, there is no option but to go hunting. The following describes my particular hunt on my most recent merge. Yours will be different.

-- add the old users back into the jos_users table
INSERT INTO jos_users (name, username, email, password, usertype, block,
        sendEmail, gid, registerDate, lastvisitDate, activation, params)
SELECT
                old_jos_users.name,
                old_jos_users.username,
                old_jos_users.email,
                old_jos_users.password,
                old_jos_users.usertype,
                old_jos_users.block,
                old_jos_users.sendEmail,
                old_jos_users.gid,
                old_jos_users.registerDate,
                old_jos_users.lastvisitDate,
                old_jos_users.activation,
                old_jos_users.params
FROM
                old_jos_users;

-- put their ids into a map table so we can fix up the other references
CREATE TABLE old_jos_users_id_map
SELECT
                jos_users.id AS new_id,
                old_jos_users.id AS old_id
        FROM
                jos_users
                JOIN old_jos_users ON
                old_jos_users.name = jos_users.name AND
                old_jos_users.username = jos_users.username AND
                old_jos_users.email = jos_users.email AND
                old_jos_users.registerDate = jos_users.registerDate;

-- After perusing tables that might have user IDs in them, use the map
-- to add in appropriate rows for our old users
-- old_jos_contact_details empty - no work needed.
-- jos_core_acl_aro needs our old users added back in and their aro id
-- needs inserting into jos_core_acl_groups appropriately

INSERT INTO jos_core_acl_aro (section_value, `value`, order_value, name,
            hidden)
SELECT
                'users' AS 'section_value',
                old_jos_users_id_map.new_id AS 'value',
                0 AS 'order_value',
                old_jos_core_acl_aro.name AS 'name',
                0 AS 'hidden'
        FROM
                old_jos_core_acl_aro
                JOIN old_jos_users_id_map ON
                   old_jos_core_acl_aro.`value` = old_jos_users_id_map.old_id;

INSERT INTO jos_core_acl_groups_aro_map (group_id, aro_id)
SELECT
                old_jos_core_acl_groups_aro_map.group_id AS group_id,
                jos_core_acl_aro.id AS 'aro_id'
FROM
                old_jos_core_acl_groups_aro_map
                JOIN old_jos_core_acl_aro ON
                   old_jos_core_acl_aro.id = old_jos_core_acl_groups_aro_map.aro_id
                JOIN old_jos_users_id_map ON
                   old_jos_core_acl_aro.`value` = old_jos_users_id_map.old_id
                JOIN jos_core_acl_aro ON
                   old_jos_users_id_map.new_id = jos_core_acl_aro.`value`;

-- jos_juga_u2g maps user ids to JUGA groups. These need to be changed to
-- reflect the new ids
-- This is rendered a little bit painful by the face that the id sets overlap
-- creating the possibility
-- of 'temporary' duplicate key entries during the update process (there
-- could be some issues here but this
-- works so I'll move on for now).

ALTER TABLE jos_juga_u2g
DROP PRIMARY KEY;

UPDATE jos_juga_u2g
JOIN old_jos_users_id_map ON jos_juga_u2g.user_id = old_jos_users_id_map.old_id
SET jos_juga_u2g.user_id = old_jos_users_id_map.new_id;

ALTER TABLE jos_juga_u2g
ADD PRIMARY KEY (user_id, group_id);

-- virtue mart has jos_vm_auth_user_group but for me virtue mart was
-- only in the source so no mapping required.

Now logins for users on the old system should work and all of the JUGA access control should work (although testing this latter part has to wait until we merge the content because right now JUGA will be protecting all the wrong articles etc.

Next we want to merge the article sections, categories and content. Luckily for me on this particular run through, the articles are pretty sparse in the target but the process is much the same in any event. We'll work in msql client again:

-- our sections and category names are unique and can be used to lookup
-- the relevant ids so we can just insert our old_jos_sections into the table
INSERT INTO jos_sections (title, name, alias, scope, image, image_position,
                          description, published, checked_out, checked_out_time,
                          ordering, access, `count`, params)
SELECT
                title,
                name,
                alias,
                image,
                scope,
                image_position,
                description,
                published,
                checked_out,
                checked_out_time,
                ordering,
                access,
                count,
                params
        FROM
                old_jos_sections;

-- Likewise our categories with the slight complication of needing
-- to get the new section id via its title

INSERT INTO jos_categories (parent_id, title, name, alias, image, section,
                            image_position, description, published, 
                            checked_out, checked_out_time, editor, ordering,
                            access, `count`, params)
SELECT
                old_jos_categories.parent_id,
                old_jos_categories.title,
                old_jos_categories.name,
                old_jos_categories.alias,
                old_jos_categories.image,
                jos_sections.id AS section,
                old_jos_categories.image_position,
                old_jos_categories.description,
                old_jos_categories.published,
                old_jos_categories.checked_out,
                old_jos_categories.checked_out_time,
                old_jos_categories.editor,
                old_jos_categories.ordering,
                old_jos_categories.access,
                old_jos_categories.count,
                old_jos_categories.params
        FROM
                old_jos_categories
                JOIN old_jos_sections ON
                   old_jos_categories.section = old_jos_sections.id
                JOIN jos_sections ON old_jos_sections.title = jos_sections.title;

-- Now the sections and categories are there, we can add the articles
-- themselves. First get the titles unique for convenience:

select title, count(*) from 
(select * from jos_content union select * from old_jos_content)
as tmp group by title having count(*) > 1;

-- Twink the clashing titles in the tables so we get an empty set from the
-- above query and then we're free to merge in the old articles
-- safe in the knowledge we can identify them by their titles. For example:

update old_jos_content set title = "Contact Us2" where title = "Contact Us";


-- insert the old content into the jos_content table

INSERT INTO jos_content ( `title`, `alias`, `title_alias`, `introtext`,
                          `fulltext`, `state`, `sectionid`,
                                `mask`, `catid`, `created`, `created_by`,
                          `created_by_alias`, `modified`,
                                `modified_by`, `checked_out`, `checked_out_time`,
                          `publish_up`, `publish_down`,
                                `images`, `urls`, `attribs`, `version`,
                          `parentid`, `ordering`, `metakey`,
                                `metadesc`, `access`, `hits`, `metadata`)
SELECT
                old_jos_content.`title`,
                old_jos_content.`alias`,
                old_jos_content.`title_alias`,
                old_jos_content.`introtext`,
                old_jos_content.`fulltext`,
                old_jos_content.`state`,
                jos_sections.id AS `sectionid`,
                old_jos_content.`mask`,
                jos_categories.id AS `catid`,
                old_jos_content.`created`,
                old_jos_content.`created_by`,
                old_jos_content.`created_by_alias`,
                old_jos_content.`modified`,
                old_jos_content.`modified_by`,
                old_jos_content.`checked_out`,
                old_jos_content.`checked_out_time`,
                old_jos_content.`publish_up`,
                old_jos_content.`publish_down`,
                old_jos_content.`images`,
                old_jos_content.`urls`,
                old_jos_content.`attribs`,
                old_jos_content.`version`,
                old_jos_content.`parentid`,
                old_jos_content.`ordering`,
                old_jos_content.`metakey`,
                old_jos_content.`metadesc`,
                old_jos_content.`access`,
                old_jos_content.`hits`,
                old_jos_content.`metadata`
        FROM
                old_jos_content
                JOIN old_jos_sections ON old_jos_content.sectionid = old_jos_sections.id
                JOIN jos_sections ON old_jos_sections.title = jos_sections.title
                JOIN old_jos_categories ON old_jos_content.catid = old_jos_categories.id
                JOIN jos_categories ON old_jos_categories.title = jos_categories.title;

OK, that puts all the articles in.

Now we need to configure the components appropriately. These will involve significant overlap and there is the possibility that they carry different configuration information. Its a pretty sloppy table so we just need to insert any elements that are not effectively already in the table.

-- Add any component entries that are not already present

INSERT INTO jos_components (`name`, `link`, `menuid`, `parent`, `admin_menu_link`,
                            `admin_menu_alt`,
                                `option`, `ordering`, `admin_menu_img`, `iscore`,
                             `params`, `enabled`)
SELECT
                ojc.`name`,
                ojc.`link`,
                ojc.`menuid`,
                ojc.`parent`,
                ojc.`admin_menu_link`,
                ojc.`admin_menu_alt`,
                ojc.`option`,
                ojc.`ordering`,
                ojc.`admin_menu_img`,
                ojc.`iscore`,
                ojc.`params`,
                ojc.`enabled`
        FROM
                old_jos_components ojc
                LEFT JOIN jos_components jc ON
                ojc.`name` = jos_components.`name` AND
                ojc.`link` = jos_components.`link` AND
                ojc.`menuid` = jos_components.`menuid` AND
                ojc.`parent` = jos_components.`parent` AND
                ojc.`admin_menu_link` = jos_components.`admin_menu_link` AND
                ojc.`admin_menu_alt` = jos_components.`admin_menu_alt` AND
                ojc.`option` = jos_components.`option` AND
                ojc.`ordering` = jos_components.`ordering` AND
                ojc.`admin_menu_img` = jos_components.`admin_menu_img` AND
                ojc.`iscore` = jos_components.`iscore` AND
                ojc.`params` = jos_components.`params` AND
                ojc.`enabled` = jos_components.`enabled`
        WHERE
                jos_components.id IS NULL;

In a similar manner, we need to include the plugins...

INSERT INTO jos_plugins (`name`, `element`, `folder`, `access`, 
                      `ordering`, `published`,
                         `iscore`, `client_id`, `checked_out`,
                      `checked_out_time`, `params`)
SELECT
                old_jos_plugins.`name`,
                old_jos_plugins.`element`,
                old_jos_plugins.`folder`,
                old_jos_plugins.`access`,
                old_jos_plugins.`ordering`,
                old_jos_plugins.`published`,
                old_jos_plugins.`iscore`,
                old_jos_plugins.`client_id`,
                old_jos_plugins.`checked_out`,
                old_jos_plugins.`checked_out_time`,
                old_jos_plugins.`params`
        FROM
                old_jos_plugins
                LEFT JOIN jos_plugins ON
                        old_jos_plugins.`name` = jos_plugins.`name` AND
                        old_jos_plugins.`element` = jos_plugins.`element`
        WHERE
                jos_plugins.id IS NULL;

OK, last but not least, I like to get some of the menus over. We'll need to fiddle about with the modules and presentation of them but the basic menus are likely to remain the same (or in any event be easier to edit than to re-enter from scratch) We'll extract the menus we want from the (misnamed) jos_menu_types as well as the (also misnamed) jos_menu so...

-- Get the menus we want to re-use to save me from having to
-- re-enter them in manually

INSERT INTO jos_menu ( menutype, name, alias, link, type,
                 published, parent, componentid, sublevel,
                 ordering, checked_out, checked_out_time, pollid,
                 browserNav, access, utaccess, params, lft, rgt, home)
SELECT
                `menutype`,
                `name`,
                `alias`,
                `link`,
                `type`,
                `published`,
                `parent`,
                `componentid`,
                `sublevel`,
                `ordering`,
                `checked_out`,
                `checked_out_time`,
                `pollid`,
                `browserNav`,
                `access`,
                `utaccess`,
                `params`,
                `lft`,
                `rgt`,
                `home`
        FROM
                old_jos_menu
        WHERE
                menutype = "staff-menu" OR
                menutype = "manager-menu";

From here, the content is all there but there will be presentation work required to access the target content effectively. That cannot easily be handled in code because its got too much to do with the visual layout of the assets on the screen).You may need to merge the style configuration sheets etc. by hand to get the final result.


Comments

Chris Wibbe 7 years, 7 months ago

Thanks! - just the jos_users?
Hi -

Thanks for writing this. I read through and I am trying to boil what you have done down to just merging the user table from one Joomla db to another. Maybe if I were a little more knowledgeable about MySQL I could whizz right through this but thats now the case. I need to get past the conflicting IDs. If there a way (preferably with phpMyAdmin)to export/import just the jos_users table and create new ID values for each row?

Link | Reply

Whippy 7 years, 7 months ago

To just do the jos_users I'm afraid you will still have a bit of work to do. First check your joomla extensions and add ons. If you have a relatively simple target db then you probably only need to concern yourself with the aro mappings mentioned in the main article. These are essential to give the users the access rights that are normally set up when they are created.

The trick with the conflicting ids in jos_users itself is to let the target db create new ids for you. I did this by the 'insert into jos_users...' statement in the main article - notice that it does not try to insert the old id. It leaves mysql to create new non conflicting ids. Unfortunately, you do then need to follow through on the aro tables at the very least to ensure that the newly added users have appropriate access rights when they login.

These commands should work fine in phpmyadmin. You can either run a query by pasting it into the sql box in phpmyadmin or by importing it as a script file.

Let me know if you need more help and perhaps we can meet up on IRC at a mutually convenient time so I can help you through the process. With your permission, I'll email you with the necessary details.

Link | Reply

Daniel 3 years, 10 months ago

I copied it to another site
I merged two sites without this once and remember from when i did it that because I was using a community plugin called jomsocial the users table was stuffed up and all i did was to get the extra tables it creates called jos_core_acl_aro / map / group etc...
But i guess the thing is do you have any social components and are there extra users tables with around the same amount of entries a jos_users.

Link | Reply

Daniel 3 years, 10 months ago

Error
theres an error at tables_in_target_and_source part where it doesnt work and stops you from creating the file hence the whole thing doesnt work, although i didn't need to worry about duplications really.

Link | Reply

Paul Whipp 3 years, 10 months ago

Hi Daniel,
Can you be a bit more specific. This guide was derived from a number of actual merges so it certainly works but if there is a mistake in the transcription above, I'll fix it.

Link | Reply

Daniel 3 years, 10 months ago

File Missing tables_in_target_and_source
This code is where the file will not create at all, or the major problem is it says an error file not found tables_in_target_and_source. so mabey that is meant to be a reference to different file or more likely theres something missing or this code just doesnt work it might be a version problem or a change in the syntax.

To Achive what I wanted to Acheive I didnt need to worry about duplicate content because I wasnt going to keep that version of the site because I had another version manually joined without the need to merge them, since i was creating it from scratch.

I just wanted to see if there was still things going missing due to some sort of bug using two templates which is another topic i guess. A drop down went missing trying to display K2 content categories templates.

cat tables_in_target_and_source |
sed 's/(.*)/mysql -e "create table old_1 select * from 1;"/g'
> make_copies;
./make_copies;

Link | Reply

New Comment

required

required (not published)

optional

Australia: 07 3103 2894

International: +61 410 545 357