Serialised data (WordPress): verschil tussen versies

Uit De Vliegende Brigade
Naar navigatie springen Naar zoeken springen
 
(38 tussenliggende versies door dezelfde gebruiker niet weergegeven)
Regel 1: Regel 1:
Soms bevatten database-velden ''serialised data'': Meerdere gegevens in één veld, die zijn samengevoegd middels een bepaalde syntaxis.
+
Sometimes database fields contain ''serialized data'': Multiple data in one field, concatenated using a certain syntax. Although the concept of ''serialized data'' seems quite at odds with some basic databases principles, there is an obvious application for: ''Storing objects''.
  
Je kunt zeggen dat serialised data nogal haaks staat op het concept van databases. Aan de andere kant: Het biedt flexibiliteit die anders soms niet mogelijk is.
+
However, compaired to e.g., YAML, this nameless protocol that WordPress uses, has one major issue: Before each string, the length of that string is specified. When you change the string but fail to update this string-count, the object is not readable anymore. See below for a solution.
  
== Voorbeeld ==
+
== Example ==
  
In tabel <code>wp_options</code> wordt in het veld <code>widget_woocommerce_layered_nav</code> alle Woocommerce-widgets gecodeerd. Dit is een voorbeeld van een webwinkel in okt. 2020.
+
In table <code>wp_options</code>, field <code>widget_woocommerce_layered_nav</code> encodes all Woocommerce widgets. This is an example of an online store in Oct. 2020.
 
 
Rauwe inhoud:
 
  
 +
Raw content:
 
<pre>
 
<pre>
 
a:14:{i:4;a:4:{s:5:"title";s:21:"Originele priri";s:9:"attribute";s:21:"originele priri";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:5;a:4:{s:5:"title";s:12:"Breedte (mm)";s:9:"attribute";s:7:"breedte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:6;a:4:{s:5:"title";s:11:"Diepte (mm)";s:9:"attribute";s:6:"diepte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:7;a:4:{s:5:"title";s:11:"Lengte (mm)";s:9:"attribute";s:6:"lengte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:8;a:4:{s:5:"title";s:18:"Automatische stop?";s:9:"attribute";s:17:"automatische_stop";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:9;a:4:{s:5:"title";s:12:"Stroomdraad?";s:9:"attribute";s:11:"stroomdraad";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:10;a:4:{s:5:"title";s:10:"Connector?";s:9:"attribute";s:9:"connector";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:11;a:4:{s:5:"title";s:5:"Veer?";s:9:"attribute";s:4:"veer";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:12;a:4:{s:5:"title";s:7:"Diagram";s:9:"attribute";s:7:"diagram";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:13;a:4:{s:5:"title";s:19:"Ztoepassing - Model";s:9:"attribute";s:16:"toepassing_model";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:14;a:4:{s:5:"title";s:21:"Toepassing - Submodel";s:9:"attribute";s:19:"toepassing_submodel";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:15;a:4:{s:5:"title";s:17:"Azijn-productcode";s:9:"attribute";s:10:"azijn_code";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:16;a:4:{s:5:"title";s:11:"Productlijn";s:9:"attribute";s:11:"productlijn";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}s:12:"_multiwidget";i:1;}
 
a:14:{i:4;a:4:{s:5:"title";s:21:"Originele priri";s:9:"attribute";s:21:"originele priri";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:5;a:4:{s:5:"title";s:12:"Breedte (mm)";s:9:"attribute";s:7:"breedte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:6;a:4:{s:5:"title";s:11:"Diepte (mm)";s:9:"attribute";s:6:"diepte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:7;a:4:{s:5:"title";s:11:"Lengte (mm)";s:9:"attribute";s:6:"lengte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:8;a:4:{s:5:"title";s:18:"Automatische stop?";s:9:"attribute";s:17:"automatische_stop";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:9;a:4:{s:5:"title";s:12:"Stroomdraad?";s:9:"attribute";s:11:"stroomdraad";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:10;a:4:{s:5:"title";s:10:"Connector?";s:9:"attribute";s:9:"connector";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:11;a:4:{s:5:"title";s:5:"Veer?";s:9:"attribute";s:4:"veer";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:12;a:4:{s:5:"title";s:7:"Diagram";s:9:"attribute";s:7:"diagram";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:13;a:4:{s:5:"title";s:19:"Ztoepassing - Model";s:9:"attribute";s:16:"toepassing_model";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:14;a:4:{s:5:"title";s:21:"Toepassing - Submodel";s:9:"attribute";s:19:"toepassing_submodel";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:15;a:4:{s:5:"title";s:17:"Azijn-productcode";s:9:"attribute";s:10:"azijn_code";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:16;a:4:{s:5:"title";s:11:"Productlijn";s:9:"attribute";s:11:"productlijn";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}s:12:"_multiwidget";i:1;}
 
</pre>
 
</pre>
  
De eerste paar regels in meer leesbare vorm:
+
The first few lines in more readable form:
  
 
<pre>
 
<pre>
Regel 49: Regel 48:
 
s:5:"title";s:12:"Stroomdraad?";s:9:"attribute";s:11:"stroomdraad";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";
 
s:5:"title";s:12:"Stroomdraad?";s:9:"attribute";s:11:"stroomdraad";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";
 
}
 
}
 +
}
 +
</pre>
 +
 +
== The problem with serialised data ==
 +
 +
Working with serialised data in WordPress poses a specific problem: Prior to each string, its length in bytes in UTF8 (or something like that), is specified. If this number is incorrect, WordPress will refuse to process the entire field.
 +
 +
* This means that it becomes quite difficult to process serialised data other than through appropriate API-calls. So no messing around directly in MySQL - Where otherwise this would be the only obvious solution.
 +
* But it gets worse: you cannot fix a messed-up serial field through WP-CLI functions that otherwise know how to handle serialised data. In order to fix messed-up serial fields. you probably have to recreate the whole object on a higher abstraction level (e.g.: Import a page afresh).
 +
 +
=== The problem - Example ===
 +
 +
On a site, page 51873 contains buttonbars. I can still retrieve all options for this page (including all buttonbars, implemented as ACF fields) with <code>wp post meta list</code>:
 +
 +
<pre>
 +
wp post meta list 51873
 +
 +
| post_id | meta_key                                    | meta_value                                                                                                                                     
 +
+---------+---------------------------------------------+----------------------------------------------+
 +
| 51873  | buttonbar_selection_0_image                | 53521                                                                                                                                               
 +
| 51873  | _buttonbar_selection_0_image                | field_5e450106734a3                                                                                                                                 
 +
| 51873  | buttonbar_selection_0_link                  | a:3:{s:5:"title";s:0:"";s:3:"url";s:36:"https://example-canada.com/shop/blub/";s:6:"target";s:0:"";}                                         
 +
| 51873  | _buttonbar_selection_0_link                | field_5e450111734a4
 +
</pre>
 +
 +
Note that field <code>buttonbar_selection_0_link</code> contains serialised data. The problem: The string that is specified to be 36 characters/bytes/whatever, isn't that number. Hence you can't fetch the field using <code>wp post meta get</code>, eventhough the other fields around it, are still retrievable:
 +
 +
<pre>
 +
wp post meta get 51873 buttonbar_selection_0_title
 +
wp post meta get 51873 buttonbar_selection_0_image
 +
wp post meta get 51873 buttonbar_selection_0_link
 +
</pre>
 +
 +
Output:
 +
 +
<pre>
 +
Widget
 +
53521
 +
</pre>
 +
 +
=== You can't directly update the compromised field ===
 +
 +
Looks promising, doesn't it?
 +
 +
<pre>
 +
wp post meta list 51873
 +
wp post meta get 51873 buttonbar_selection_0_title
 +
wp post meta get 51873 buttonbar_selection_0_image
 +
wp post meta get 51873 buttonbar_selection_0_link
 +
 +
wp post meta pluck 51873 buttonbar_selection_0_link title
 +
wp post meta pluck 51873 buttonbar_selection_0_link url
 +
wp post meta pluck 51873 buttonbar_selection_0_link blub
 +
 +
wp post meta patch update 51873 buttonbar_selection_0_link url "https://example.com"
 +
 +
wp post meta list 51873
 +
wp post meta get 51873 buttonbar_selection_0_title
 +
wp post meta get 51873 buttonbar_selection_0_image
 +
wp post meta get 51873 buttonbar_selection_0_link
 +
</pre>
 +
 +
But it's unforgiving. Even the non-compromised part ''title'' cannot be updated. The reading-part does work for a similar site without this problem (I didn't check writing there):
 +
 +
<pre>
 +
...
 +
Error: No data exists for key "title"
 +
Error: No data exists for key "url"
 +
...
 +
</pre>
 +
 +
=== You can't insert into the compromised field ===
 +
 +
Pfff:
 +
 +
<pre>
 +
wp post meta list 51873 | grep buttonbar_selection_0_link
 +
 +
 +
# Insert?
 +
#
 +
wp post meta patch insert 51873 buttonbar_selection_0_link title "Example.com"
 +
wp post meta patch insert 51873 buttonbar_selection_0_link url "https://example.com"
 +
</pre>
 +
 +
Output:
 +
 +
<pre>
 +
Error: Cannot create key "title" on data type boolean
 +
Error: Cannot create key "url" on data type boolean
 +
</pre>
 +
 +
=== You can't delete the subfields ===
 +
 +
By now, this was to be expected:
 +
 +
<pre>
 +
wp post meta patch delete 51873 buttonbar_selection_0_link title
 +
wp post meta patch delete 51873 buttonbar_selection_0_link url
 +
</pre>
 +
 +
with output:
 +
 +
<pre>
 +
Error: No data exists for key "title"
 +
Error: No data exists for key "url"
 +
</pre>
 +
 +
=== You can delete the field completely ===
 +
 +
Yep:
 +
 +
<pre>
 +
wp post meta delete 51873 buttonbar_selection_0_link
 +
wp post meta list 51873 | grep buttonbar_selection_0_
 +
</pre>
 +
 +
Note that the field name including underscore (<code>_buttonbar_selection_0_link</code>), still exists, but that the version without underscore (<code>buttonbar_selection_0_link</code>) is gone:
 +
 +
<pre>
 +
51873 buttonbar_selection_0_title Widget
 +
51873 _buttonbar_selection_0_title field_5e450082734a2
 +
51873 buttonbar_selection_0_image 53521
 +
51873 _buttonbar_selection_0_image field_5e450106734a3
 +
51873 _buttonbar_selection_0_link field_5e450111734a4
 +
</pre>
 +
 +
=== How to recreate? ===
 +
 +
Now that the field is completely gone, how to recreate it? This doesn't seem to work:
 +
 +
<pre>
 +
wp post meta patch insert 51873 buttonbar_selection_0_link title "Blub"
 +
wp post meta patch insert 51873 buttonbar_selection_0_link url "https://example.com"
 +
wp post meta patch insert 51873 buttonbar_selection_0_link target ""
 +
 +
wp post meta list 51873 | grep buttonbar_selection_0_
 +
</pre>
 +
 +
Return:
 +
 +
<pre>
 +
Error: Cannot create key "title" on data type string
 +
Error: Cannot create key "url" on data type string
 +
Error: Cannot create key "target" on data type string
 +
</pre>
 +
 +
== Tools to fix broken serialized data - Inventory ==
 +
 +
There is hope:
 +
 +
* https://btb.works/how-to-fix-a-broken-serialization/ ← I used this one. See elsewhere
 +
* https://stackoverflow.com/questions/15138893/fix-serialized-data-broken-due-to-editing-mysql-database-in-a-text-editor
 +
* https://stackoverflow.com/questions/3148712/regex-code-to-fix-corrupt-serialized-php-data/55074706#55074706
 +
* https://github.com/Blogestudio/Fix-Serialization
 +
* https://github.com/wsizoo/wordpress-fix-serialization
 +
 +
== fix-serialization.php ==
 +
 +
Around 2021 & 2022, I have used the ''fix-serializatin.php'' script from https://btb.works/how-to-fix-a-broken-serialization/. It works very well, except for a critical bug: <code>$content</code> should be <code>$data</code>. Otherwise, an empty file is written. My version of the script:
 +
 +
<pre>
 +
<?php
 +
#
 +
# Fix serial data string length errors
 +
################################################################################
 +
#
 +
# @author: Wojtek Szałkiewicz - wojtek  at  bracketspace.com
 +
# @license: GPL version 3 or later - http://www.gnu.org/licenses/gpl.txt
 +
# Based on Pau Iglesias http://blogestudio.com package
 +
#
 +
#
 +
# Usage
 +
########################################
 +
#
 +
# * php fix-serialization.php my-sql-file.sql
 +
# * php fix-serialization.php my-input-sql-file.sql my-output-sql-file.sql
 +
#
 +
#
 +
# Known errors
 +
########################################
 +
#
 +
# * Memory size exhausted: Allowed memory size of 67108864 bytes exhausted
 +
#  (tried to allocate 35266489 bytes) How to fix: update php.ini
 +
#  memory_limit to 512M or more, and restart cgi service or webserver
 +
# * Function preg_replace returns null or 0 length string If preg_last_error =
 +
#  PREG_BACKTRACK_LIMIT_ERROR (value 2), increase pcre.backtrack_limit in
 +
#  php.ini (by default 100k, change to 2M by example) Same way for others
 +
#  preg_last_error codes:
 +
#  http://www.php.net/manual/en/#unction.preg-last-error.php
 +
#
 +
#
 +
################################################################################
 +
# unescape_mysql()
 +
################################################################################
 +
#
 +
function unescape_mysql( $value )
 +
{
 +
#
 +
# * Remove redundant MySQL database dump escape codes
 +
# * Used in main code below
 +
#
 +
return str_replace(
 +
[ "\\\\", "\\0", "\\n", "\\r", "\Z",  "\'", '\"' ],
 +
[ "\\",  "\0",  "\n",  "\r",  "\x1a", "'", '"' ],
 +
$value
 +
);
 +
}
 +
 +
 +
################################################################################
 +
# unescape_quotes()
 +
################################################################################
 +
#
 +
function unescape_quotes( $value )
 +
{
 +
# * Remove escape character "\" before a double quote
 +
# * Used in main code below
 +
#
 +
return str_replace('\"', '"', $value);
 +
}
 +
 +
 +
################################################################################
 +
# Initialisation
 +
################################################################################
 +
#
 +
# * Read file into $data
 +
# * Reset counters
 +
#
 +
#
 +
# Check for presence of input file name
 +
########################################
 +
#
 +
if ( ! isset( $argv ) || ! isset( $argv[1] ) )
 +
{
 +
#
 +
echo "Error: no input file specified\n\n";
 +
exit;
 +
#
 +
}
 +
 +
 +
# Store filename in "$input
 +
########################################
 +
#
 +
$input = $argv[1];
 +
 +
 +
# Check for existence of input file
 +
########################################
 +
#
 +
if ( ! file_exists( $input ) )
 +
{
 +
#
 +
echo "Error: input file does not exists\n";
 +
echo "{$input}\n\n";
 +
exit;
 +
#
 +
}
 +
 +
 +
# Store file contents in "$data"
 +
########################################
 +
#
 +
$data = file_get_contents( $input );
 +
 +
 +
# Check for succesful file reading
 +
########################################
 +
#
 +
if ( ! $data )
 +
{
 +
#
 +
echo "Error: can`t read data from input file\n";
 +
echo "{$input}\n\n";
 +
exit;
 +
}
 +
 +
 +
# Check that file has content
 +
########################################
 +
#
 +
if ( ! isset( $data ) || ! ( strlen( $data ) > 0 ) )
 +
{
 +
#
 +
echo "Warning: the file is empty or can't read contents\n";
 +
echo "{$input}\n\n";
 +
exit;
 +
#
 +
}
 +
 +
 +
# Set counters to 0
 +
########################################
 +
#
 +
$processed = 0;
 +
$fixed    = 0;
 +
 +
 +
################################################################################
 +
# Main code: Search & replace
 +
################################################################################
 +
#
 +
$data = preg_replace_callback('!s:(\d+):([\\\\]?"[\\\\]?"|[\\\\]?"((.*?)[^\\\\])[\\\\]?");!', function( $matches ) use ( &$processed, &$fixed )
 +
{
 +
#
 +
# * preg_replace_callback — Perform a regular expression search and replace
 +
#  using a callback
 +
#  https://www.php.net/manual/en/function.preg-replace-callback.php
 +
# * This seems to be the main function. It seems to work recursively or
 +
#  something.
 +
#  Quite impressive to see the input argument
 +
# * The data is continuously stored in "$data"
 +
#
 +
#
 +
# Found a string statement?
 +
########################################
 +
#
 +
if ( isset( $matches[3] ) )
 +
{
 +
#
 +
# Recalculate string length
 +
########################################
 +
#
 +
$len = strlen( unescape_mysql( $matches[3] ) );
 +
$str = unescape_quotes( $matches[3] );
 +
$result = "s:{$len}:\"$str\";";
 +
 +
 +
# Do old and new string length differ?
 +
########################################
 +
#
 +
if ( $result !== $matches[0] )
 +
{
 +
# Yes: Update fix counter
 +
########################################
 +
#
 +
$fixed++;
 +
}
 +
}
 +
 +
# No string statement found (?)
 +
########################################
 +
#
 +
else
 +
{
 +
#
 +
# Just return the unchanged argument
 +
########################################
 +
#
 +
$result = $matches[0];
 +
}
 +
 +
 +
# Increass string counter
 +
########################################
 +
#
 +
$processed++;
 +
 +
return $result;
 +
 +
}, $data);
 +
 +
 +
################################################################################
 +
# Check results from "preg_replace"
 +
################################################################################
 +
#
 +
if ( ! ( isset( $data ) && strlen( $data ) > 0 ) )
 +
{
 +
// Error
 +
echo "Error: preg_replace returns nothing\n";
 +
 +
if ( function_exists( 'preg_last_error' ) ) {
 +
$last_error = preg_last_error();
 +
echo "preg_last_error() = {$last_error}\n";
 +
}
 +
 +
echo "{$input}\n\n";
 +
exit;
 +
}
 +
 +
 +
################################################################################
 +
# Write to old or new file?
 +
################################################################################
 +
#
 +
if ( isset( $argv[2] ) )
 +
{
 +
  $output = $argv[2];
 +
  # echo "Output file name has been provided: {$argv[2]}\n";
 +
} else
 +
{
 +
  $output = $input;
 +
  # echo "No output file name has been provided - Overwrite input file\n";
 +
}
 +
 +
 +
################################################################################
 +
# Write to file
 +
################################################################################
 +
#
 +
# * file_put_contents($filename, $data) is a standard PHP function
 +
# * Original first line: $result = file_put_contents( $output, $content );
 +
#  It seems that "$content" should have been "$data". Otherwise, an empty file
 +
#  is written
 +
#
 +
#
 +
$result = file_put_contents( $output, $data );
 +
// Write file data
 +
if ( false === $result ) {
 +
// Error
 +
echo "Error: can't write fixed content\n";
 +
echo "{$output}\n\n";
 +
exit;
 +
}
 +
 +
 +
################################################################################
 +
# Exit
 +
################################################################################
 +
#
 +
echo "Found strings: {$processed}\n";
 +
echo "Fixed: {$fixed}\n";
 +
 +
</pre>
 +
 +
== fix-serialization.php - Example ==
 +
 +
I use this self-developed Bash function <code>tourlib-general-settings.sh » fix_corrupt_serialised_data()</code> to automatically fix all serialised data errors in a database:
 +
 +
<pre>
 +
################################################################################
 +
# fix_corrupt_serialised_data()
 +
################################################################################
 +
#
 +
fix_corrupt_serialised_data()
 +
{
 +
#
 +
# * Create db dump, fix database, import again
 +
# * Moved from tourlib-de-buttonbar-pages.sh - 2022.11.26
 +
# * This function doesn't make a database backup prior to its operation.
 +
#  It is highly recommended to do so. Use the existing function(s) to do
 +
#  so, e.g., tourlib-instantiate-site.sh » backup_db_to_folder_dvb()
 +
#
 +
#
 +
# Initialisation
 +
########################################
 +
#
 +
echo ""; echo ""; echo "### fix_corrupt_serialised_data..."; echo ""
 +
if [ -z "$project_path" ]; then echo " Variable 'project_path' not provided. Exiting"; exit; fi
 +
if [ -z "$db_name" ]; then echo " Variable 'db_name' not provided. Exiting"; exit; fi
 +
if [ -z "$path" ]; then echo " Variable 'path' not provided. Exiting"; exit; fi
 +
 +
 +
# cd to project location
 +
#######################################
 +
#
 +
cd $project_path
 +
 +
 +
# Create time stamp
 +
#######################################
 +
#
 +
time_stamp=$(date +%Y.%m.%d-%H.%M)
 +
 +
 +
# Create db dump
 +
#######################################
 +
#
 +
echo "Create db dump (takes about 20s)..."
 +
#
 +
mysqldump $db_name > ${db_name}-${time_stamp}-org.sql
 +
 +
 +
# Fix db
 +
#######################################
 +
#
 +
echo "Fix db..."
 +
php ~/scripts/fix-serialization.php ${db_name}-${time_stamp}-org.sql ${db_name}-${time_stamp}-fixed.sql
 +
 +
 +
# Import db
 +
#######################################
 +
#
 +
echo "Import db again (takes about 70s)..."
 +
mysql $db_name < ${db_name}-${time_stamp}-fixed.sql
 +
 +
 +
# Delete transients
 +
#######################################
 +
#
 +
cd $path
 +
wp transient delete --all
 +
#
 
}
 
}
 
</pre>
 
</pre>
Regel 54: Regel 549:
 
== PHP: Serialize & unserialize ==
 
== PHP: Serialize & unserialize ==
  
WordPress gebruikt de PHP-functie <code>serialize()</code> om data te ''serializen'' voor opslag in de database en de functie <code>unserialize</code> voor de tegenovergestelde bewerking.  
+
WordPress uses the PHP function <code>serialize()</code> to ''serialize'' data for storage in the database and the <code>unserialize</code> function for the opposite operation.
 
 
Voorbeeld [https://stackoverflow.com/questions/13319638/working-with-serialized-data-in-wordpress]:
 
  
 +
Example [https://stackoverflow.com/questions/13319638/working-with-serialized-data-in-wordpress]:
 
<pre>
 
<pre>
 
$serialized = 'a:3:{i:0;s:5:"apple";i:1;s:6:"banana";i:2;s:6:"orange";}';
 
$serialized = 'a:3:{i:0;s:5:"apple";i:1;s:6:"banana";i:2;s:6:"orange";}';
Regel 71: Regel 565:
 
)
 
)
 
</pre>
 
</pre>
 +
 +
== WP-CLI ==
 +
 +
[[Hoofdpagina#WP-CLI | WP-CLI]] can handle serialised data very well, as long as it is about ''specific fields''. E.g.: The ''positional parameters'' <code>pluck</code> and <code>patch</code> are for reading and writing parts of serialised data.
 +
 +
However, there are no generic WP-CLI commands for serializing and unserializing.
 +
 +
Maybe the aforementioned PHP commands can be invoked through <code>wp eval</code>, but so far (2023.11, after half an hour of trying) I didn't get it to work → [[Wp eval (WP-CLI)]].
  
 
== MySQL ==
 
== MySQL ==
  
MySQL lijkt geen ingebouwde functies te hebben die vergelijkbaar zijn met de PHP-functies ''serialize'' en ''unserialize''. Dit had overigens best gekund: MySQL heeft bv. wel functies om met XML om te gaan. Verder:
+
MySQL does not seem to have built-in functions similar to the PHP functions ''serialize'' and ''unserialize''. This would have been possible: MySQL, for example, does have functions to deal with XML.
 
 
* Tot op heden (2021.03.02) heb ik geen libraries of sprocs of scripts gevonden die dit kunnen [https://stackoverflow.com/questions/5335970/unserialize-through-query-at-database-level-itself]. En ook dat had best anders kunnen zijn: Zo ingewikkeld lijkt dit niet te zijn
 
* Ik heb een primitieve parser voor ''find-en-replace''. Zoiets als de ingebouwde functie <code>replace</code>, maar dan voor strings met serialised content met een beperkte vaste structuur
 
* MySQL kent ''recursie'' en ''limited variable scope'' - Concepten die vermoedelijk goed van pas komen als ik ooit een flexibelere parser wil bouwen.
 
  
 
== Syntaxis ==
 
== Syntaxis ==
Regel 132: Regel 630:
 
# * s:27:"Balais de charbon d'origine"
 
# * s:27:"Balais de charbon d'origine"
 
</pre>
 
</pre>
 +
 +
== See also ==
 +
 +
* [[Wp eval (WP-CLI)]]
  
 
== Bronnen ==
 
== Bronnen ==
Regel 139: Regel 641:
 
* https://stackoverflow.com/questions/8922668/updating-base64-encode-serialized-array-on-mysql-databases - unserialize, re-serialize
 
* https://stackoverflow.com/questions/8922668/updating-base64-encode-serialized-array-on-mysql-databases - unserialize, re-serialize
 
* https://stackoverflow.com/questions/5335970/unserialize-through-query-at-database-level-itself
 
* https://stackoverflow.com/questions/5335970/unserialize-through-query-at-database-level-itself
* https://stackoverflow.com/questions/62980734/deserialize-and-serialize-data-in-mysql - Same use case as me
+
* https://stackoverflow.com/questions/62980734/deserialize-and-serialize-data-in-mysql - Same use case as me. I've posted my sproc here
 +
* https://wpengine.com/support/wordpress-serialized-data

Huidige versie van 27 nov 2023 om 18:07

Sometimes database fields contain serialized data: Multiple data in one field, concatenated using a certain syntax. Although the concept of serialized data seems quite at odds with some basic databases principles, there is an obvious application for: Storing objects.

However, compaired to e.g., YAML, this nameless protocol that WordPress uses, has one major issue: Before each string, the length of that string is specified. When you change the string but fail to update this string-count, the object is not readable anymore. See below for a solution.

Example

In table wp_options, field widget_woocommerce_layered_nav encodes all Woocommerce widgets. This is an example of an online store in Oct. 2020.

Raw content:

a:14:{i:4;a:4:{s:5:"title";s:21:"Originele priri";s:9:"attribute";s:21:"originele priri";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:5;a:4:{s:5:"title";s:12:"Breedte (mm)";s:9:"attribute";s:7:"breedte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:6;a:4:{s:5:"title";s:11:"Diepte (mm)";s:9:"attribute";s:6:"diepte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:7;a:4:{s:5:"title";s:11:"Lengte (mm)";s:9:"attribute";s:6:"lengte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:8;a:4:{s:5:"title";s:18:"Automatische stop?";s:9:"attribute";s:17:"automatische_stop";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:9;a:4:{s:5:"title";s:12:"Stroomdraad?";s:9:"attribute";s:11:"stroomdraad";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:10;a:4:{s:5:"title";s:10:"Connector?";s:9:"attribute";s:9:"connector";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:11;a:4:{s:5:"title";s:5:"Veer?";s:9:"attribute";s:4:"veer";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:12;a:4:{s:5:"title";s:7:"Diagram";s:9:"attribute";s:7:"diagram";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:13;a:4:{s:5:"title";s:19:"Ztoepassing - Model";s:9:"attribute";s:16:"toepassing_model";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:14;a:4:{s:5:"title";s:21:"Toepassing - Submodel";s:9:"attribute";s:19:"toepassing_submodel";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:15;a:4:{s:5:"title";s:17:"Azijn-productcode";s:9:"attribute";s:10:"azijn_code";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:16;a:4:{s:5:"title";s:11:"Productlijn";s:9:"attribute";s:11:"productlijn";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}s:12:"_multiwidget";i:1;}

The first few lines in more readable form:

a:14:
{
	i:4;a:4:
	{
		s:5:"title";
		s:21:"Originele pripripripr";
		s:9:"attribute";
		s:21:"originele pripripripr";
		s:12:"display_type";
		s:4:"list";
		s:10:"query_type";
		s:3:"and";
	}
	i:5;a:4:
	{
		s:5:"title";s:12:"Breedte (mm)";s:9:"attribute";s:7:"breedte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";
	}
	i:6;a:4:
	{
		s:5:"title";s:11:"Diepte (mm)";s:9:"attribute";s:6:"diepte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";
	}
	i:7;a:4:
	{
		s:5:"title";s:11:"Lengte (mm)";s:9:"attribute";s:6:"lengte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";
	}
	i:8;a:4:
	{
		s:5:"title";s:18:"Automatische stop?";s:9:"attribute";s:17:"automatische_stop";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";
	}
	i:9;a:4:
	{
		s:5:"title";s:12:"Stroomdraad?";s:9:"attribute";s:11:"stroomdraad";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";
	}
}

The problem with serialised data

Working with serialised data in WordPress poses a specific problem: Prior to each string, its length in bytes in UTF8 (or something like that), is specified. If this number is incorrect, WordPress will refuse to process the entire field.

  • This means that it becomes quite difficult to process serialised data other than through appropriate API-calls. So no messing around directly in MySQL - Where otherwise this would be the only obvious solution.
  • But it gets worse: you cannot fix a messed-up serial field through WP-CLI functions that otherwise know how to handle serialised data. In order to fix messed-up serial fields. you probably have to recreate the whole object on a higher abstraction level (e.g.: Import a page afresh).

The problem - Example

On a site, page 51873 contains buttonbars. I can still retrieve all options for this page (including all buttonbars, implemented as ACF fields) with wp post meta list:

wp post meta list 51873

| post_id | meta_key                                    | meta_value                                                                                                                                       
+---------+---------------------------------------------+----------------------------------------------+
| 51873   | buttonbar_selection_0_image                 | 53521                                                                                                                                                
| 51873   | _buttonbar_selection_0_image                | field_5e450106734a3                                                                                                                                  
| 51873   | buttonbar_selection_0_link                  | a:3:{s:5:"title";s:0:"";s:3:"url";s:36:"https://example-canada.com/shop/blub/";s:6:"target";s:0:"";}                                          
| 51873   | _buttonbar_selection_0_link                 | field_5e450111734a4

Note that field buttonbar_selection_0_link contains serialised data. The problem: The string that is specified to be 36 characters/bytes/whatever, isn't that number. Hence you can't fetch the field using wp post meta get, eventhough the other fields around it, are still retrievable:

wp post meta get 51873 buttonbar_selection_0_title
wp post meta get 51873 buttonbar_selection_0_image
wp post meta get 51873 buttonbar_selection_0_link

Output:

Widget
53521

You can't directly update the compromised field

Looks promising, doesn't it?

wp post meta list 51873
wp post meta get 51873 buttonbar_selection_0_title
wp post meta get 51873 buttonbar_selection_0_image
wp post meta get 51873 buttonbar_selection_0_link

wp post meta pluck 51873 buttonbar_selection_0_link title
wp post meta pluck 51873 buttonbar_selection_0_link url
wp post meta pluck 51873 buttonbar_selection_0_link blub

wp post meta patch update 51873 buttonbar_selection_0_link url "https://example.com"

wp post meta list 51873
wp post meta get 51873 buttonbar_selection_0_title
wp post meta get 51873 buttonbar_selection_0_image
wp post meta get 51873 buttonbar_selection_0_link

But it's unforgiving. Even the non-compromised part title cannot be updated. The reading-part does work for a similar site without this problem (I didn't check writing there):

...
Error: No data exists for key "title"
Error: No data exists for key "url"
...

You can't insert into the compromised field

Pfff:

wp post meta list 51873 | grep buttonbar_selection_0_link


# Insert?
#
wp post meta patch insert 51873 buttonbar_selection_0_link title "Example.com"
wp post meta patch insert 51873 buttonbar_selection_0_link url "https://example.com"

Output:

Error: Cannot create key "title" on data type boolean
Error: Cannot create key "url" on data type boolean

You can't delete the subfields

By now, this was to be expected:

wp post meta patch delete 51873 buttonbar_selection_0_link title
wp post meta patch delete 51873 buttonbar_selection_0_link url

with output:

Error: No data exists for key "title"
Error: No data exists for key "url"

You can delete the field completely

Yep:

wp post meta delete 51873 buttonbar_selection_0_link
wp post meta list 51873 | grep buttonbar_selection_0_

Note that the field name including underscore (_buttonbar_selection_0_link), still exists, but that the version without underscore (buttonbar_selection_0_link) is gone:

51873	buttonbar_selection_0_title	Widget
51873	_buttonbar_selection_0_title	field_5e450082734a2
51873	buttonbar_selection_0_image	53521
51873	_buttonbar_selection_0_image	field_5e450106734a3
51873	_buttonbar_selection_0_link	field_5e450111734a4

How to recreate?

Now that the field is completely gone, how to recreate it? This doesn't seem to work:

wp post meta patch insert 51873 buttonbar_selection_0_link title "Blub"
wp post meta patch insert 51873 buttonbar_selection_0_link url "https://example.com"
wp post meta patch insert 51873 buttonbar_selection_0_link target ""

wp post meta list 51873 | grep buttonbar_selection_0_

Return:

Error: Cannot create key "title" on data type string
Error: Cannot create key "url" on data type string
Error: Cannot create key "target" on data type string

Tools to fix broken serialized data - Inventory

There is hope:

fix-serialization.php

Around 2021 & 2022, I have used the fix-serializatin.php script from https://btb.works/how-to-fix-a-broken-serialization/. It works very well, except for a critical bug: $content should be $data. Otherwise, an empty file is written. My version of the script:

<?php
#
# Fix serial data string length errors
################################################################################
# 
# @author: Wojtek Szałkiewicz - wojtek  at  bracketspace.com
# @license: GPL version 3 or later - http://www.gnu.org/licenses/gpl.txt
# Based on Pau Iglesias http://blogestudio.com package
#
#
# Usage
########################################
#
# * php fix-serialization.php my-sql-file.sql
# * php fix-serialization.php my-input-sql-file.sql my-output-sql-file.sql
#
#
# Known errors
########################################
#
# * Memory size exhausted: Allowed memory size of 67108864 bytes exhausted
#   (tried to allocate 35266489 bytes) How to fix: update php.ini 
#   memory_limit to 512M or more, and restart cgi service or webserver
# * Function preg_replace returns null or 0 length string If preg_last_error =
#   PREG_BACKTRACK_LIMIT_ERROR (value 2), increase pcre.backtrack_limit in 
#   php.ini (by default 100k, change to 2M by example) Same way for others
#   preg_last_error codes: 
#   http://www.php.net/manual/en/#unction.preg-last-error.php
#
#
################################################################################
# unescape_mysql()
################################################################################
#
function unescape_mysql( $value )
{
	#
	# * Remove redundant MySQL database dump escape codes
	# * Used in main code below
	#
	return str_replace(
		[ "\\\\", "\\0", "\\n", "\\r", "\Z",  "\'", '\"' ],
		[ "\\",   "\0",  "\n",  "\r",  "\x1a", "'", '"' ],
		$value
	);
}


################################################################################
# unescape_quotes()
################################################################################
#
function unescape_quotes( $value )
{
	# * Remove escape character "\" before a double quote
	# * Used in main code below
	#
	return str_replace('\"', '"', $value);
}


################################################################################
# Initialisation
################################################################################
#
# * Read file into $data
# * Reset counters
#
#
# Check for presence of input file name
########################################
#
if ( ! isset( $argv ) || ! isset( $argv[1] ) )
{
	#
	echo "Error: no input file specified\n\n";
	exit;
	#
}


# Store filename in "$input
########################################
#
$input = $argv[1];


# Check for existence of input file
########################################
#
if ( ! file_exists( $input ) )
{
	#
	echo "Error: input file does not exists\n";
	echo "{$input}\n\n";
	exit;
	#
}


# Store file contents in "$data"
########################################
#
$data = file_get_contents( $input );


# Check for succesful file reading
########################################
#
if ( ! $data ) 
{
	#
	echo "Error: can`t read data from input file\n";
	echo "{$input}\n\n";
	exit;
}


# Check that file has content
########################################
#
if ( ! isset( $data ) || ! ( strlen( $data ) > 0 ) )
{
	#	
	echo "Warning: the file is empty or can't read contents\n";
	echo "{$input}\n\n";
	exit;
	#
}


# Set counters to 0
########################################
#
$processed = 0;
$fixed     = 0;


################################################################################
# Main code: Search & replace
################################################################################
#
$data = preg_replace_callback('!s:(\d+):([\\\\]?"[\\\\]?"|[\\\\]?"((.*?)[^\\\\])[\\\\]?");!', function( $matches ) use ( &$processed, &$fixed )
{
	#
	# * preg_replace_callback — Perform a regular expression search and replace 
	#   using a callback
	#   https://www.php.net/manual/en/function.preg-replace-callback.php
	# * This seems to be the main function. It seems to work recursively or
	#   something.
	#   Quite impressive to see the input argument
	# * The data is continuously stored in "$data"
	#
	#
	# Found a string statement?
	########################################
	#
	if ( isset( $matches[3] ) )
	{
		#
		# Recalculate string length
		########################################
		#
		$len = strlen( unescape_mysql( $matches[3] ) );
		$str = unescape_quotes( $matches[3] );
		$result = "s:{$len}:\"$str\";";


		# Do old and new string length differ?
		########################################
		#
		if ( $result !== $matches[0] )
		{
			# Yes: Update fix counter
			########################################
			#
			$fixed++;
		}
	}

	# No string statement found (?)
	########################################
	#
	else 
	{
		#
		# Just return the unchanged argument
		########################################
		#
		$result = $matches[0];
	}


	# Increass string counter
	########################################
	#
	$processed++;

	return $result;
	
}, $data);


################################################################################
# Check results from "preg_replace"
################################################################################
#
if ( ! ( isset( $data ) && strlen( $data ) > 0 ) )
{
	// Error
	echo "Error: preg_replace returns nothing\n";

	if ( function_exists( 'preg_last_error' ) ) {
		$last_error = preg_last_error();
		echo "preg_last_error() = {$last_error}\n";
	}

	echo "{$input}\n\n";
	exit;
}


################################################################################
# Write to old or new file?
################################################################################
#
if ( isset( $argv[2] ) )
{
  $output = $argv[2];
  # echo "Output file name has been provided: {$argv[2]}\n";
} else
{
  $output = $input;
  # echo "No output file name has been provided - Overwrite input file\n";
}


################################################################################
# Write to file
################################################################################
#
# * file_put_contents($filename, $data) is a standard PHP function
# * Original first line: $result = file_put_contents( $output, $content );
#   It seems that "$content" should have been "$data". Otherwise, an empty file
#   is written
#
#
$result = file_put_contents( $output, $data );
// Write file data
if ( false === $result ) {
	// Error
	echo "Error: can't write fixed content\n";
	echo "{$output}\n\n";
	exit;
}


################################################################################
# Exit
################################################################################
#
echo "Found strings: {$processed}\n";
echo "Fixed: {$fixed}\n";

fix-serialization.php - Example

I use this self-developed Bash function tourlib-general-settings.sh » fix_corrupt_serialised_data() to automatically fix all serialised data errors in a database:

################################################################################
# fix_corrupt_serialised_data()
################################################################################
#
fix_corrupt_serialised_data()
{
	#
	# * Create db dump, fix database, import again
	# * Moved from tourlib-de-buttonbar-pages.sh - 2022.11.26
	# * This function doesn't make a database backup prior to its operation.
	#   It is highly recommended to do so. Use the existing function(s) to do
	#   so, e.g., tourlib-instantiate-site.sh » backup_db_to_folder_dvb()
	#
	#
	# Initialisation
	########################################
	#
	echo ""; echo ""; echo "### fix_corrupt_serialised_data..."; echo ""
	if [ -z "$project_path" ]; then echo "	Variable 'project_path' not provided. Exiting"; exit; fi
	if [ -z "$db_name" ]; then 			echo "	Variable 'db_name' not provided. Exiting"; exit; fi
	if [ -z "$path" ]; then 			echo "	Variable 'path' not provided. Exiting"; exit; fi


	# cd to project location
	#######################################
	#
	cd $project_path


	# Create time stamp
	#######################################
	#
	time_stamp=$(date +%Y.%m.%d-%H.%M)


	# Create db dump
	#######################################
	#
	echo "Create db dump (takes about 20s)..."
	#
	mysqldump $db_name > ${db_name}-${time_stamp}-org.sql


	# Fix db
	#######################################
	#
	echo "Fix db..."
	php ~/scripts/fix-serialization.php ${db_name}-${time_stamp}-org.sql ${db_name}-${time_stamp}-fixed.sql


	# Import db
	#######################################
	#
	echo "Import db again (takes about 70s)..."
	mysql $db_name < ${db_name}-${time_stamp}-fixed.sql


	# Delete transients
	#######################################
	#
	cd $path
	wp transient delete --all
	#
}

PHP: Serialize & unserialize

WordPress uses the PHP function serialize() to serialize data for storage in the database and the unserialize function for the opposite operation.

Example [1]:

$serialized = 'a:3:{i:0;s:5:"apple";i:1;s:6:"banana";i:2;s:6:"orange";}';
var_dump(unserialize($serialized));

Output:

Array
(
    [0] => apple
    [1] => banana
    [2] => orange
)

WP-CLI

WP-CLI can handle serialised data very well, as long as it is about specific fields. E.g.: The positional parameters pluck and patch are for reading and writing parts of serialised data.

However, there are no generic WP-CLI commands for serializing and unserializing.

Maybe the aforementioned PHP commands can be invoked through wp eval, but so far (2023.11, after half an hour of trying) I didn't get it to work → Wp eval (WP-CLI).

MySQL

MySQL does not seem to have built-in functions similar to the PHP functions serialize and unserialize. This would have been possible: MySQL, for example, does have functions to deal with XML.

Syntaxis

Dankzij het voorbeeld hierboven, kun je al een deel van de syntaxis achterhalen:

  • Data-elementen zijn omhuld door { en }
  • a: Array
  • i: Counter; index
  • s: String + lengte. Bv. s:5"apple".

Stringlengte

Strings worden voorafgegaan door de lengte van de betreffende string, gemeten in UTF8-bytes. Als dit getal niet klopt, wordt de betreffende serialised data, niet verwerkt.

Voorbeelden:

# The string "0...9" is exactly 10 bytes long: Every character is exactly one byte
#
s:10:"0123456789"
# The letter "è" takes two bytes, hence the whole string is 7 bytes long, although only 4 characters long
#
s:7:"Bèèè"
concat('s:27:"Balais de charbon d', char(39), 'origine"')
#
# * Balais de charbon d	→ 19 characters & 19 bytes
# * char(39) 		→ 1 character & 1 byte
# * origine		→ 7 characters & 7 bytes
#
# * Total: 27 bytes

Enveloping symbols

  • Usually, you can choose to envelop a string within 'apostrophes' or within "quotes"
  • However, we have a complication here: The actual string already contains both apostrophes & quotes
  • The way to solve this: Use apostrophes for enveloping the string. When an apostrophe is needed inside the string, and use char(39) to represent the apostrophe inside the string
  • BTW: char(39) counts as one byte

Voorbeeld:

concat('s:27:"Balais de charbon d', char(39), 'origine"')
#
# * The string (inside the apostrophes) becomes
# * s:27:"Balais de charbon d'origine"

See also

Bronnen