Serialised data (WordPress)
Sometimes database fields contain serialized data: Multiple data in one field, which are concatenated using a certain syntax.
Although the concept of serialized data seems quite at odds with some basic databases principles, it offers flexibility that otherwise would be hard to get by. In addition, this is actually used to store Object-oriented objects - I hadn't thought of that before!
Example
In table wp_options
, field widget_woocommerce_layered_nav
encodes all Woocommerce widgets. This is an example of an online store in Oct. 2020.
Raw content:
a:14:{i:4;a:4:{s:5:"title";s:21:"Originele priri";s:9:"attribute";s:21:"originele priri";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:5;a:4:{s:5:"title";s:12:"Breedte (mm)";s:9:"attribute";s:7:"breedte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:6;a:4:{s:5:"title";s:11:"Diepte (mm)";s:9:"attribute";s:6:"diepte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:7;a:4:{s:5:"title";s:11:"Lengte (mm)";s:9:"attribute";s:6:"lengte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:8;a:4:{s:5:"title";s:18:"Automatische stop?";s:9:"attribute";s:17:"automatische_stop";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:9;a:4:{s:5:"title";s:12:"Stroomdraad?";s:9:"attribute";s:11:"stroomdraad";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:10;a:4:{s:5:"title";s:10:"Connector?";s:9:"attribute";s:9:"connector";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:11;a:4:{s:5:"title";s:5:"Veer?";s:9:"attribute";s:4:"veer";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:12;a:4:{s:5:"title";s:7:"Diagram";s:9:"attribute";s:7:"diagram";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:13;a:4:{s:5:"title";s:19:"Ztoepassing - Model";s:9:"attribute";s:16:"toepassing_model";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:14;a:4:{s:5:"title";s:21:"Toepassing - Submodel";s:9:"attribute";s:19:"toepassing_submodel";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:15;a:4:{s:5:"title";s:17:"Azijn-productcode";s:9:"attribute";s:10:"azijn_code";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:16;a:4:{s:5:"title";s:11:"Productlijn";s:9:"attribute";s:11:"productlijn";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}s:12:"_multiwidget";i:1;}
The first few lines in more readable form:
a:14: { i:4;a:4: { s:5:"title"; s:21:"Originele pripripripr"; s:9:"attribute"; s:21:"originele pripripripr"; s:12:"display_type"; s:4:"list"; s:10:"query_type"; s:3:"and"; } i:5;a:4: { s:5:"title";s:12:"Breedte (mm)";s:9:"attribute";s:7:"breedte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and"; } i:6;a:4: { s:5:"title";s:11:"Diepte (mm)";s:9:"attribute";s:6:"diepte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and"; } i:7;a:4: { s:5:"title";s:11:"Lengte (mm)";s:9:"attribute";s:6:"lengte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and"; } i:8;a:4: { s:5:"title";s:18:"Automatische stop?";s:9:"attribute";s:17:"automatische_stop";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and"; } i:9;a:4: { s:5:"title";s:12:"Stroomdraad?";s:9:"attribute";s:11:"stroomdraad";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and"; } }
The problem with serialised data & WordPress
Working with serialised data in WordPress poses a specific problem: Prior to each string, its length in bytes in UTF8 (or something like that), is specified. If this number is incorrect, WordPress will refuse to process the entire field.
- This means that it becomes quite difficult to process serialised data other than through appropriate API-calls. So no messing around directly in MySQL - Where otherwise this would be the only obvious solution.
- But it gets worse: you cannot fix a messed-up serial field through WP-CLI functions that otherwise know how to handle serialised data. In order to fix messed-up serial fields. you probably have to recreate the whole object on a higher abstraction level (e.g.: Import a page afresh).
The problem - Example
On a site, page 51873 contains buttonbars. I can still retrieve all options for this page (including all buttonbars, implemented as ACF fields) with wp post meta list
:
wp post meta list 51873 | post_id | meta_key | meta_value +---------+---------------------------------------------+----------------------------------------------+ | 51873 | buttonbar_selection_0_image | 53521 | 51873 | _buttonbar_selection_0_image | field_5e450106734a3 | 51873 | buttonbar_selection_0_link | a:3:{s:5:"title";s:0:"";s:3:"url";s:36:"https://example-canada.com/shop/blub/";s:6:"target";s:0:"";} | 51873 | _buttonbar_selection_0_link | field_5e450111734a4
Note that field buttonbar_selection_0_link
contains serialised data. The problem: The string that is specified to be 36 characters/bytes/whatever, isn't that number. Hence you can't fetch the field using wp post meta get
, eventhough the other fields around it, are still retrievable:
wp post meta get 51873 buttonbar_selection_0_title wp post meta get 51873 buttonbar_selection_0_image wp post meta get 51873 buttonbar_selection_0_link
Output:
Widget 53521
You can't directly update the compromised field
Looks promising, doesn't it?
wp post meta list 51873 wp post meta get 51873 buttonbar_selection_0_title wp post meta get 51873 buttonbar_selection_0_image wp post meta get 51873 buttonbar_selection_0_link wp post meta pluck 51873 buttonbar_selection_0_link title wp post meta pluck 51873 buttonbar_selection_0_link url wp post meta pluck 51873 buttonbar_selection_0_link blub wp post meta patch update 51873 buttonbar_selection_0_link url "https://example.com" wp post meta list 51873 wp post meta get 51873 buttonbar_selection_0_title wp post meta get 51873 buttonbar_selection_0_image wp post meta get 51873 buttonbar_selection_0_link
But it's inforgiving. Even the non-compromised part title cannot be updated. The reading-part does work for a similar site without this problem (I didn't check writing there):
... Error: No data exists for key "title" Error: No data exists for key "url" ...
You can't insert into the compromised field
Pfff:
wp post meta list 51873 | grep buttonbar_selection_0_link # Insert? # wp post meta patch insert 51873 buttonbar_selection_0_link title "Example.com" wp post meta patch insert 51873 buttonbar_selection_0_link url "https://example.com"
Output:
Error: Cannot create key "title" on data type boolean Error: Cannot create key "url" on data type boolean
You can't delete the subfields
By now, this was to be expected:
wp post meta patch delete 51873 buttonbar_selection_0_link title wp post meta patch delete 51873 buttonbar_selection_0_link url
with output:
Error: No data exists for key "title" Error: No data exists for key "url"
You can delete the field completely
Yep:
wp post meta delete 51873 buttonbar_selection_0_link wp post meta list 51873 | grep buttonbar_selection_0_
Note that the field name including underscore (_buttonbar_selection_0_link
), still exists, but that the version without underscore (buttonbar_selection_0_link
) is gone:
51873 buttonbar_selection_0_title Widget 51873 _buttonbar_selection_0_title field_5e450082734a2 51873 buttonbar_selection_0_image 53521 51873 _buttonbar_selection_0_image field_5e450106734a3 51873 _buttonbar_selection_0_link field_5e450111734a4
PHP: Serialize & unserialize
WordPress uses the PHP function serialize()
to serialize data for storage in the database and the unserialize
function for the opposite operation.
Example [1]:
$serialized = 'a:3:{i:0;s:5:"apple";i:1;s:6:"banana";i:2;s:6:"orange";}'; var_dump(unserialize($serialized)); Output: Array ( [0] => apple [1] => banana [2] => orange )
WP-CLI
WP-CLI can handle serialised data very well. The positional parameters pluck
and patch
are for reading and writing parts of serialised data
MySQL
MySQL does not seem to have built-in functions similar to the PHP functions serialize and unserialize. This would have been possible: MySQL, for example, does have functions to deal with XML.
Syntaxis
Dankzij het voorbeeld hierboven, kun je al een deel van de syntaxis achterhalen:
- Data-elementen zijn omhuld door
{
en}
a
: Arrayi
: Counter; indexs
: String + lengte. Bv.s:5"apple"
.
Stringlengte
Strings worden voorafgegaan door de lengte van de betreffende string, gemeten in UTF8-bytes. Als dit getal niet klopt, wordt de betreffende serialised data, niet verwerkt.
Voorbeelden:
# The string "0...9" is exactly 10 bytes long: Every character is exactly one byte # s:10:"0123456789"
# The letter "è" takes two bytes, hence the whole string is 7 bytes long, although only 4 characters long # s:7:"Bèèè"
concat('s:27:"Balais de charbon d', char(39), 'origine"') # # * Balais de charbon d → 19 characters & 19 bytes # * char(39) → 1 character & 1 byte # * origine → 7 characters & 7 bytes # # * Total: 27 bytes
Enveloping symbols
- Usually, you can choose to envelop a string within 'apostrophes' or within "quotes"
- However, we have a complication here: The actual string already contains both apostrophes & quotes
- The way to solve this: Use apostrophes for enveloping the string. When an apostrophe is needed inside the string, and use char(39) to represent the apostrophe inside the string
- BTW:
char(39)
counts as one byte
Voorbeeld:
concat('s:27:"Balais de charbon d', char(39), 'origine"') # # * The string (inside the apostrophes) becomes # * s:27:"Balais de charbon d'origine"
MySQL - Sproc replace_serial
Dit was een leuk puzzeltje om te doen, maar het practische nut is helaas beperkt.
CREATE DEFINER=`strompf`@`localhost` PROCEDURE `replace_serial`( in table_name_in tinytext, # Source table name in col_serial_data_src_in tinytext, # Column with source serialised data in col_serial_data_dst_in tinytext, # Column with destination serialised data in col_src_in tinytext, # Column with source text strings in col_dst_in tinytext # Column with desination text string ) BEGIN # # Find-and-replace within serialised content ################################################################################ # # * This sproc "replace_serial" needs as input pointers to a table # ("pointer table") with the following columns: # * Column with the source serialised data # * Column for storing the updated serialised data # * Column with find-string # * Column with replace-strings # * It can be any table: The arguments for replace_serial() contains the name # of this table and the names of those four columns # * Internally, data from this input-table is copied to temporary table # "serial_tmp" # * The designated serialised data is decomposed into several fields (again, # in table "serial_tmp") - Currently, this sproc only works for serialised # data with 1 a-element, followed by 6 s-elements. This sproc has not been # tested with other configurations # * Find-and-replace is only performed on element s4 (=the 4th s-element) # * Subsequently, the new length of s4 is calculated, and the serialised string # is written to the designated column in the "pointer table" # * Subsequently, it's up to the programmer to process the resulting data in # the pointer table # * This approach with a "pointer table" works really well: Easy to grasp, # flexible and easy to debug. It's especially handy for debugging or further # processing that the output is in a separate column from the input. # * Copyright Jeroen Strompf - De Vliegende Brigade - 2021.03.09 # # # History ######################################## # # * 2021.03.09: Created, tested and pushed to "bal_dwh_org" # # ################################################################################ # Define working table "serial_tmp" ################################################################################ # drop table if exists serial_tmp; create temporary table serial_tmp ( serial_data_src varchar(255) comment "Actual serialised data", serial_data_dst varchar(255) comment "Resulting serialised data. This is a separate field, to make debugging easier", find_string varchar(255) comment "String to be found within serial_data_src", replace_string varchar(255) comment "Replacement string for within serial_data_src, resulting in serial_data_dst", a_len smallint, a_string varchar(255) comment "The string within the accolade", s1 varchar(255) comment "Complete s-part, including length statement and string", s2 varchar(255), s3 varchar(255), s4 varchar(255) comment "Complete s4-string", s4_len_src smallint comment "Original length of s4", s4_string_src varchar(255) comment "Original s4 string", s4_len_dst smallint comment "New length", s4_string_dst varchar(255) comment "New string", s5 varchar(255), s6 varchar(255) ); ################################################################################ # Copy working data to working table ################################################################################ # set @ddl=concat ( "insert into serial_tmp ( serial_data_src, find_string, replace_string ) select ", col_serial_data_src_in, ", ", col_src_in, ", ", col_dst_in, " ", "from ", table_name_in, ";" ); prepare stmt from @ddl; execute stmt; deallocate prepare stmt; ################################################################################ # Decompose serialised data ################################################################################ # # Leading a-field ######################################## # update serial_tmp set a_len = substring_index(substring_index(serial_data_src, ":", 2), ":", -1), a_string = substring_index(substring_index(serial_data_src, "}",1), "{", -1); # Complete s-fields ######################################## # update serial_tmp set s1 = substring_index(a_string, ";", 1), s2 = substring_index(substring_index(a_string, ";", 2), ";", -1), s3 = substring_index(substring_index(a_string, ";", 3), ";", -1), s4 = substring_index(substring_index(a_string, ";", 4), ";", -1), s5 = substring_index(substring_index(a_string, ";", 5), ";", -1), s6 = substring_index(substring_index(a_string, ";", 6), ";", -1); # Further decompose s4 ######################################## # update serial_tmp set s4_len_src = substring_index(substring_index(s4, ":", 2), ":", -1), s4_string_src = substring_index(substring_index(s4, '"', 2), '"', -1); ################################################################################ # Find-and-replace ################################################################################ # # Finally! The heart of the matter # update serial_tmp set s4_string_dst = replace(s4_string_src, find_string, replace_string), s4_len_dst = length(s4_string_dst); ################################################################################ # Reassemble ################################################################################ # update serial_tmp set serial_data_dst = concat ( "a:",a_len, ":", "{", s1, ";", s2, ";", s3, ";", "s:",s4_len_dst,":", char(34), s4_string_dst, char(34), ";", s5, ";", s6, ";", "}" ); ################################################################################ # Copy results back to the original input table ################################################################################ # # * Fill in the result in column "col_serial_data_dst_in" # set @ddl2=concat ( "update ", table_name_in, " join serial_tmp on serial_tmp.serial_data_src = ", table_name_in, ".",col_serial_data_src_in," set ",table_name_in, ".",col_serial_data_dst_in, " = serial_tmp.serial_data_dst;" ); prepare stmt from @ddl2; execute stmt; deallocate prepare stmt; END
Bronnen
- https://stackoverflow.com/questions/13319638/working-with-serialized-data-in-wordpress
- https://stackoverflow.com/questions/46908387/how-to-use-a-serialized-or-json-encoded-attribute-in-a-mysql-stored-procedure
- https://stackoverflow.com/questions/8922668/updating-base64-encode-serialized-array-on-mysql-databases - unserialize, re-serialize
- https://stackoverflow.com/questions/5335970/unserialize-through-query-at-database-level-itself
- https://stackoverflow.com/questions/62980734/deserialize-and-serialize-data-in-mysql - Same use case as me. I've posted my sproc here
- https://wpengine.com/support/wordpress-serialized-data