Import CSV with PHP to update MySQL data

I have written a very simple script to perform a useful function for me. I figured I’d share it so that others can make use of it.

If you have a CSV file with the first row containing headers and the subsequent rows containing data that you wish to update along with an ID column, then this will work for you without any issues. If your primary key is different or you wish to match other criteria you may need to adjust the script a bit. This does not facilitate upload but only uses a csv file residing on the same directory as the script.

MySQL zerofill and lpad: whipping your digits into shape

I’m working with a large dataset. One of the columns is set as an INT(8) but not all of the values are 8 digits. I need to run a query that will sum values and group by the first two digits in that INT(8) column. This presents a potential problem since it may end up grouping incorrectly since I need everything to be grouped based off of 8-digit numbers. How do you make these smaller numbers into 8-digits?

Simple. There are two ways.

  1. Alter the table itself using the ZEROFILL in MySQL. This will add 0s to the left of your values up to the max number of digits defined for that column:
    • ALTER TABLE [table name] MODIFY COLUMN [column name] INT(x) ZEROFILL UNSIGNED;
    • Where table name = your table name, column name = the desired column to pad, and x is the number of digits allowed for that column
  2. Pad the values using lpad() in your query itself, without altering your table:
    • select lpad(column name,x,'0') from table name;
    • Where table name = your table name, column name = desired column to pd, and x = the number of digits to allow.
    • If your value in the DB is 332, your result will be 0332. If your value is 9, the result will be 0009.

Hopefully this will help someone facing the same problem I was. It’s also good to know that there are multiple ways of doing things.

Tuning MySQL for performance with Ubuntu 11.04 and MySQLTuner

One of my current projects requires extremely heavy use of very large data sets housed in sever MySQL databases and spread across several servers. This data feeds various web apps, business processes, and fulfills many requests. It is data that needs to be highly available at all times and return queries in the shortest amount of time possible. There are several database tables with upwards of 15 million records and in some instances these need to be joined to other tables, used in calculations, etc.

I need results now, not 48 hours from now. Performance is essential but hard to come by with legacy servers made out of repurposed machines. In order to see where the bottlenecks are and try to make the best of the situation I decided to do some searching. We’re running Ubuntu 11.04 servers for this purpose and I needed something that would give me a run down of what’s causing performance issues on each server.

That’s when I found mysqltuner. This handy little guy is a Perl script that looks at your my.cnf and other MySQL installation data and makes recommendations about how to improve performance based on your past usage.

It’s easy to install in Ubuntu server 11.04:

sudo apt-get install mysqltuner

To run it:

sudo mysqltuner

Then simply enter your administratrive user and password and you’ll get a nice little printout like so:

image[2]

This will give you a good set of recommendations about which settings to tweak. From here you can make tweaks, restart MySQL and run the script again to see where you stand.

Hope this will help someone in the same situation!

Missing orders in Virtue Mart admin and how to fix it

I just realized that none of the recent orders in one of the Virtue Mart installations I manage were showing up. It looks like the most recent one was….three months ago. But when I search for a more recent order I get the correct results. So, the orders are in the database but they weren’t being listed on the main order list. This will go over what I’ve found in this install of Virtue Mart 1.1.2. Other versions may not suffer from this phenomenon.

Weird.

After a little digging I find that the order.order_list.php file in the /administrator/components/com_virtuemart/html folder has what I need to start my investigation. Starting at line 26 there is a bundle of SQL that joins the jos_vm_orders table to the jos_vm_order_user_info table. Well, since this is an administrator-only issue, I just echoed the $list variable at line 46, reloaded the order list in admin.

I then copied the query that was printed at the top of the page, replaced the #__{vm} with my installation’s table prefix of jos_vm and pasted the whole query in my MySQL tool of choice to query my database. Well, turns out the query isn’t returning an up to date result set. I know that the data for orders is being saved to jos_vm_orders because I can look at that table and everything is up to date. So next I move on to the table the query joins it with: jos_vm_order_user_info.

Aha! It looks like jos_vm_order_user_info wasn’t up to date at all. Weird.

According to this forum post it’s because there were custom fields added to the jos_vm_user_info database table and the jos_vm_order_user_info table no longer matched the columns. So the solution was to see what columns didn’t match up and alter the jos_vm_order_user_info table to have those missing columns. Turns out I added a single custom field and it didn’t get added to the order user info table…so after adding the column to the order user info table everything worked perfectly.

So, if you added custom fields to your user registration you should make sure your jos_vm_user_info table matches up those custom fields with the columns in jos_vm_order_user_info.

Find and Replace a letter/number and batch update with MySQL

I had the need to do a batch update of a large MySQL data table that held thousands of products and their information, including SKU numbers. Recently the company has changed how they manage the SKU numbers and needed to replace the leading ‘2’ with a ‘4’ for simple accounting tracking.

It was easy enough to use REGEX to find which ones started with ‘2’, but I was still unsure of how to replace the ‘2’ with a ‘4’ in all of the necessary records.

I knew that if the SKU started with a ‘2’, that it had to be changed to a ‘4’ so I started out using the IF() function to compare SKU’s as follows.

SELECT sku, IF(TRIM(LEADING '2' FROM sku) <> SUBSTRING(sku FROM 2),sku,INSERT(TRIM(LEADING '2' FROM sku),1,0,'4')
)
FROM product

So what this does is select all the SKU’s in the table and return the ones that need to be changed with what the change would look like. The first TRIM() function takes all of the SKUs and trims off the first 2, then it is compared to a substring of the original SKU starting at the second position. If there is no leading ‘2’ in the SKU, then it will not match the substring since nothing was trimmed.

The next argument is telling the function what to return if the TRIM() and SUBSTRING() values differ. It will return the original SKU since we don’t need to change it if it doesn’t start with ‘2’.

The last argument is telling it what to return if TRIM and SUBSTRING are equal. For this value I’m trimming the leading ‘2’ and INSERTing a ‘4’ at the first position of the SKU.

The above query lists the original SKU and what it should look like once I put it into an UPDATE query.

The UPDATE doesn’t look much different:

UPDATE product
SET sku =
IF(
  TRIM(LEADING '2' FROM sku) <> SUBSTRING(sku FROM 2),sku,INSERT(TRIM(LEADING '2' FROM sku),1,0,'4')
)

This actually updates the table with the changes that I wanted. There may have been a more elegant way to do this with REGEXP but this did the trick. I verified that only the SKUs with a leading 2 were changed to a leading 4 and the others were left as they were.