Serialised data (WordPress): verschil tussen versies
Regel 71: | Regel 71: | ||
) | ) | ||
</pre> | </pre> | ||
+ | |||
+ | == WP-CLI == | ||
+ | |||
+ | De WordPress Command Line Interface kan verrassend goed overweg met serialised data in tabel <code>wp_options</code>. Zie [[WP-CLI (commando's)]] » <code>wp option voor details. | ||
== MySQL == | == MySQL == |
Versie van 29 jul 2021 15:24
Soms bevatten database-velden serialised data: Meerdere gegevens in één veld, die zijn samengevoegd middels een bepaalde syntaxis.
Je kunt zeggen dat serialised data nogal haaks staat op het concept van databases. Aan de andere kant: Het biedt flexibiliteit die anders soms niet mogelijk is. Daarnaast wordt dit juist gebruikt om objecten op te slaan
Voorbeeld
In tabel wp_options
wordt in het veld widget_woocommerce_layered_nav
alle Woocommerce-widgets gecodeerd. Dit is een voorbeeld van een webwinkel in okt. 2020.
Rauwe inhoud:
a:14:{i:4;a:4:{s:5:"title";s:21:"Originele priri";s:9:"attribute";s:21:"originele priri";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:5;a:4:{s:5:"title";s:12:"Breedte (mm)";s:9:"attribute";s:7:"breedte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:6;a:4:{s:5:"title";s:11:"Diepte (mm)";s:9:"attribute";s:6:"diepte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:7;a:4:{s:5:"title";s:11:"Lengte (mm)";s:9:"attribute";s:6:"lengte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:8;a:4:{s:5:"title";s:18:"Automatische stop?";s:9:"attribute";s:17:"automatische_stop";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:9;a:4:{s:5:"title";s:12:"Stroomdraad?";s:9:"attribute";s:11:"stroomdraad";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:10;a:4:{s:5:"title";s:10:"Connector?";s:9:"attribute";s:9:"connector";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:11;a:4:{s:5:"title";s:5:"Veer?";s:9:"attribute";s:4:"veer";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:12;a:4:{s:5:"title";s:7:"Diagram";s:9:"attribute";s:7:"diagram";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:13;a:4:{s:5:"title";s:19:"Ztoepassing - Model";s:9:"attribute";s:16:"toepassing_model";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:14;a:4:{s:5:"title";s:21:"Toepassing - Submodel";s:9:"attribute";s:19:"toepassing_submodel";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:15;a:4:{s:5:"title";s:17:"Azijn-productcode";s:9:"attribute";s:10:"azijn_code";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:16;a:4:{s:5:"title";s:11:"Productlijn";s:9:"attribute";s:11:"productlijn";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}s:12:"_multiwidget";i:1;}
De eerste paar regels in meer leesbare vorm:
a:14: { i:4;a:4: { s:5:"title"; s:21:"Originele pripripripr"; s:9:"attribute"; s:21:"originele pripripripr"; s:12:"display_type"; s:4:"list"; s:10:"query_type"; s:3:"and"; } i:5;a:4: { s:5:"title";s:12:"Breedte (mm)";s:9:"attribute";s:7:"breedte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and"; } i:6;a:4: { s:5:"title";s:11:"Diepte (mm)";s:9:"attribute";s:6:"diepte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and"; } i:7;a:4: { s:5:"title";s:11:"Lengte (mm)";s:9:"attribute";s:6:"lengte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and"; } i:8;a:4: { s:5:"title";s:18:"Automatische stop?";s:9:"attribute";s:17:"automatische_stop";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and"; } i:9;a:4: { s:5:"title";s:12:"Stroomdraad?";s:9:"attribute";s:11:"stroomdraad";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and"; } }
PHP: Serialize & unserialize
WordPress gebruikt de PHP-functie serialize()
om data te serializen voor opslag in de database en de functie unserialize
voor de tegenovergestelde bewerking.
Voorbeeld [1]:
$serialized = 'a:3:{i:0;s:5:"apple";i:1;s:6:"banana";i:2;s:6:"orange";}'; var_dump(unserialize($serialized)); Output: Array ( [0] => apple [1] => banana [2] => orange )
WP-CLI
De WordPress Command Line Interface kan verrassend goed overweg met serialised data in tabel wp_options
. Zie WP-CLI (commando's) » wp option voor details.
MySQL
MySQL lijkt geen ingebouwde functies te hebben die vergelijkbaar zijn met de PHP-functies serialize en unserialize. Dit had overigens best gekund: MySQL heeft bv. wel functies om met XML om te gaan. Een impressie:
Algemeen
- Tot op heden (2021.03.02) heb ik geen libraries of sprocs of scripts gevonden die dit kunnen [2]. En ook dat had best anders kunnen zijn: Zo ingewikkeld lijkt dit niet te zijn
- Ik heb een primitieve parser voor find-en-replace. Zoiets als de ingebouwde functie
replace
, maar dan voor strings met serialised content met een beperkte vaste structuur - Zie verderop in dit artikel
Recursie
- MySQL kent recursie en limited variable scope - Concepten die vermoedelijk goed van pas komen als ik ooit een flexibelere parser wil bouwen
- https://stackoverflow.com/questions/20215744/how-to-create-a-mysql-hierarchical-recursive-query
Syntaxis
Dankzij het voorbeeld hierboven, kun je al een deel van de syntaxis achterhalen:
- Data-elementen zijn omhuld door
{
en }
a
: Array
i
: Counter; index
s
: String + lengte. Bv. s:5"apple"
.
Stringlengte
Strings worden voorafgegaan door de lengte van de betreffende string, gemeten in UTF8-bytes. Als dit getal niet klopt, wordt de betreffende serialised data, niet verwerkt.
Voorbeelden:
# The string "0...9" is exactly 10 bytes long: Every character is exactly one byte
#
s:10:"0123456789"
# The letter "è" takes two bytes, hence the whole string is 7 bytes long, although only 4 characters long
#
s:7:"Bèèè"
concat('s:27:"Balais de charbon d', char(39), 'origine"')
#
# * Balais de charbon d → 19 characters & 19 bytes
# * char(39) → 1 character & 1 byte
# * origine → 7 characters & 7 bytes
#
# * Total: 27 bytes
Enveloping symbols
- Usually, you can choose to envelop a string within 'apostrophes' or within "quotes"
- However, we have a complication here: The actual string already contains both apostrophes & quotes
- The way to solve this: Use apostrophes for enveloping the string. When an apostrophe is needed inside the string, and use char(39) to represent the apostrophe inside the string
- BTW:
char(39)
counts as one byte
Voorbeeld:
concat('s:27:"Balais de charbon d', char(39), 'origine"')
#
# * The string (inside the apostrophes) becomes
# * s:27:"Balais de charbon d'origine"
MySQL - Sproc replace_serial
Dit was een leuk puzzeltje om te doen, maar het practische nut is helaas beperkt.
CREATE DEFINER=`strompf`@`localhost` PROCEDURE `replace_serial`(
in table_name_in tinytext, # Source table name
in col_serial_data_src_in tinytext, # Column with source serialised data
in col_serial_data_dst_in tinytext, # Column with destination serialised data
in col_src_in tinytext, # Column with source text strings
in col_dst_in tinytext # Column with desination text string
)
BEGIN
#
# Find-and-replace within serialised content
################################################################################
#
# * This sproc "replace_serial" needs as input pointers to a table
# ("pointer table") with the following columns:
# * Column with the source serialised data
# * Column for storing the updated serialised data
# * Column with find-string
# * Column with replace-strings
# * It can be any table: The arguments for replace_serial() contains the name
# of this table and the names of those four columns
# * Internally, data from this input-table is copied to temporary table
# "serial_tmp"
# * The designated serialised data is decomposed into several fields (again,
# in table "serial_tmp") - Currently, this sproc only works for serialised
# data with 1 a-element, followed by 6 s-elements. This sproc has not been
# tested with other configurations
# * Find-and-replace is only performed on element s4 (=the 4th s-element)
# * Subsequently, the new length of s4 is calculated, and the serialised string
# is written to the designated column in the "pointer table"
# * Subsequently, it's up to the programmer to process the resulting data in
# the pointer table
# * This approach with a "pointer table" works really well: Easy to grasp,
# flexible and easy to debug. It's especially handy for debugging or further
# processing that the output is in a separate column from the input.
# * Copyright Jeroen Strompf - De Vliegende Brigade - 2021.03.09
#
#
# History
########################################
#
# * 2021.03.09: Created, tested and pushed to "bal_dwh_org"
#
#
################################################################################
# Define working table "serial_tmp"
################################################################################
#
drop table if exists
serial_tmp;
create temporary table
serial_tmp
(
serial_data_src varchar(255) comment "Actual serialised data",
serial_data_dst varchar(255) comment "Resulting serialised data. This is a separate field, to make debugging easier",
find_string varchar(255) comment "String to be found within serial_data_src",
replace_string varchar(255) comment "Replacement string for within serial_data_src, resulting in serial_data_dst",
a_len smallint,
a_string varchar(255) comment "The string within the accolade",
s1 varchar(255) comment "Complete s-part, including length statement and string",
s2 varchar(255),
s3 varchar(255),
s4 varchar(255) comment "Complete s4-string",
s4_len_src smallint comment "Original length of s4",
s4_string_src varchar(255) comment "Original s4 string",
s4_len_dst smallint comment "New length",
s4_string_dst varchar(255) comment "New string",
s5 varchar(255),
s6 varchar(255)
);
################################################################################
# Copy working data to working table
################################################################################
#
set @ddl=concat
(
"insert into serial_tmp
(
serial_data_src,
find_string,
replace_string
)
select ",
col_serial_data_src_in, ", ",
col_src_in, ", ",
col_dst_in, " ",
"from ",
table_name_in, ";"
);
prepare stmt from @ddl;
execute stmt;
deallocate prepare stmt;
################################################################################
# Decompose serialised data
################################################################################
#
# Leading a-field
########################################
#
update
serial_tmp
set
a_len = substring_index(substring_index(serial_data_src, ":", 2), ":", -1),
a_string = substring_index(substring_index(serial_data_src, "}",1), "{", -1);
# Complete s-fields
########################################
#
update
serial_tmp
set
s1 = substring_index(a_string, ";", 1),
s2 = substring_index(substring_index(a_string, ";", 2), ";", -1),
s3 = substring_index(substring_index(a_string, ";", 3), ";", -1),
s4 = substring_index(substring_index(a_string, ";", 4), ";", -1),
s5 = substring_index(substring_index(a_string, ";", 5), ";", -1),
s6 = substring_index(substring_index(a_string, ";", 6), ";", -1);
# Further decompose s4
########################################
#
update
serial_tmp
set
s4_len_src = substring_index(substring_index(s4, ":", 2), ":", -1),
s4_string_src = substring_index(substring_index(s4, '"', 2), '"', -1);
################################################################################
# Find-and-replace
################################################################################
#
# Finally! The heart of the matter
#
update
serial_tmp
set
s4_string_dst = replace(s4_string_src, find_string, replace_string),
s4_len_dst = length(s4_string_dst);
################################################################################
# Reassemble
################################################################################
#
update
serial_tmp
set
serial_data_dst = concat
(
"a:",a_len, ":",
"{",
s1, ";", s2, ";", s3, ";",
"s:",s4_len_dst,":",
char(34),
s4_string_dst,
char(34),
";", s5, ";", s6, ";",
"}"
);
################################################################################
# Copy results back to the original input table
################################################################################
#
# * Fill in the result in column "col_serial_data_dst_in"
#
set @ddl2=concat
(
"update ", table_name_in, "
join
serial_tmp
on
serial_tmp.serial_data_src = ", table_name_in, ".",col_serial_data_src_in,"
set
",table_name_in, ".",col_serial_data_dst_in, " = serial_tmp.serial_data_dst;"
);
prepare stmt from @ddl2;
execute stmt;
deallocate prepare stmt;
END
Bronnen
- https://stackoverflow.com/questions/13319638/working-with-serialized-data-in-wordpress
- https://stackoverflow.com/questions/46908387/how-to-use-a-serialized-or-json-encoded-attribute-in-a-mysql-stored-procedure
- https://stackoverflow.com/questions/8922668/updating-base64-encode-serialized-array-on-mysql-databases - unserialize, re-serialize
- https://stackoverflow.com/questions/5335970/unserialize-through-query-at-database-level-itself
- https://stackoverflow.com/questions/62980734/deserialize-and-serialize-data-in-mysql - Same use case as me. I've posted my sproc here
- https://wpengine.com/support/wordpress-serialized-data