Serialised data (WordPress): verschil tussen versies
(→WP-CLI) |
(→WP-CLI) |
||
Regel 74: | Regel 74: | ||
== WP-CLI == | == WP-CLI == | ||
− | + | [[Hoofdpagina#WP-CLI WP-CLI]] can handle serialised data very well. The ''positional parameters'' <code>pluck</code> and <code>patch</code> are for reading and writing parts of serialised data | |
== MySQL == | == MySQL == |
Versie van 29 aug 2022 11:40
Soms bevatten database-velden serialised data: Meerdere gegevens in één veld, die zijn samengevoegd middels een bepaalde syntaxis.
Je kunt zeggen dat serialised data nogal haaks staat op het concept van databases. Aan de andere kant: Het biedt flexibiliteit die anders soms niet mogelijk is. Daarnaast wordt dit juist gebruikt om objecten op te slaan
Voorbeeld
In tabel wp_options
wordt in het veld widget_woocommerce_layered_nav
alle Woocommerce-widgets gecodeerd. Dit is een voorbeeld van een webwinkel in okt. 2020.
Rauwe inhoud:
a:14:{i:4;a:4:{s:5:"title";s:21:"Originele priri";s:9:"attribute";s:21:"originele priri";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:5;a:4:{s:5:"title";s:12:"Breedte (mm)";s:9:"attribute";s:7:"breedte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:6;a:4:{s:5:"title";s:11:"Diepte (mm)";s:9:"attribute";s:6:"diepte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:7;a:4:{s:5:"title";s:11:"Lengte (mm)";s:9:"attribute";s:6:"lengte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:8;a:4:{s:5:"title";s:18:"Automatische stop?";s:9:"attribute";s:17:"automatische_stop";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:9;a:4:{s:5:"title";s:12:"Stroomdraad?";s:9:"attribute";s:11:"stroomdraad";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:10;a:4:{s:5:"title";s:10:"Connector?";s:9:"attribute";s:9:"connector";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:11;a:4:{s:5:"title";s:5:"Veer?";s:9:"attribute";s:4:"veer";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:12;a:4:{s:5:"title";s:7:"Diagram";s:9:"attribute";s:7:"diagram";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:13;a:4:{s:5:"title";s:19:"Ztoepassing - Model";s:9:"attribute";s:16:"toepassing_model";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:14;a:4:{s:5:"title";s:21:"Toepassing - Submodel";s:9:"attribute";s:19:"toepassing_submodel";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:15;a:4:{s:5:"title";s:17:"Azijn-productcode";s:9:"attribute";s:10:"azijn_code";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}i:16;a:4:{s:5:"title";s:11:"Productlijn";s:9:"attribute";s:11:"productlijn";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and";}s:12:"_multiwidget";i:1;}
De eerste paar regels in meer leesbare vorm:
a:14: { i:4;a:4: { s:5:"title"; s:21:"Originele pripripripr"; s:9:"attribute"; s:21:"originele pripripripr"; s:12:"display_type"; s:4:"list"; s:10:"query_type"; s:3:"and"; } i:5;a:4: { s:5:"title";s:12:"Breedte (mm)";s:9:"attribute";s:7:"breedte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and"; } i:6;a:4: { s:5:"title";s:11:"Diepte (mm)";s:9:"attribute";s:6:"diepte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and"; } i:7;a:4: { s:5:"title";s:11:"Lengte (mm)";s:9:"attribute";s:6:"lengte";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and"; } i:8;a:4: { s:5:"title";s:18:"Automatische stop?";s:9:"attribute";s:17:"automatische_stop";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and"; } i:9;a:4: { s:5:"title";s:12:"Stroomdraad?";s:9:"attribute";s:11:"stroomdraad";s:12:"display_type";s:4:"list";s:10:"query_type";s:3:"and"; } }
PHP: Serialize & unserialize
WordPress gebruikt de PHP-functie serialize()
om data te serializen voor opslag in de database en de functie unserialize
voor de tegenovergestelde bewerking.
Voorbeeld [1]:
$serialized = 'a:3:{i:0;s:5:"apple";i:1;s:6:"banana";i:2;s:6:"orange";}'; var_dump(unserialize($serialized)); Output: Array ( [0] => apple [1] => banana [2] => orange )
WP-CLI
Hoofdpagina#WP-CLI WP-CLI can handle serialised data very well. The positional parameters pluck
and patch
are for reading and writing parts of serialised data
MySQL
MySQL lijkt geen ingebouwde functies te hebben die vergelijkbaar zijn met de PHP-functies serialize en unserialize. Dit had overigens best gekund: MySQL heeft bv. wel functies om met XML om te gaan. Een impressie:
Algemeen
- Tot op heden (2021.03.02) heb ik geen libraries of sprocs of scripts gevonden die dit kunnen [2]. En ook dat had best anders kunnen zijn: Zo ingewikkeld lijkt dit niet te zijn
- Ik heb een primitieve parser voor find-en-replace. Zoiets als de ingebouwde functie
replace
, maar dan voor strings met serialised content met een beperkte vaste structuur - Zie verderop in dit artikel
Recursie
- MySQL kent recursie en limited variable scope - Concepten die vermoedelijk goed van pas komen als ik ooit een flexibelere parser wil bouwen
- https://stackoverflow.com/questions/20215744/how-to-create-a-mysql-hierarchical-recursive-query
Syntaxis
Dankzij het voorbeeld hierboven, kun je al een deel van de syntaxis achterhalen:
- Data-elementen zijn omhuld door
{
en}
a
: Arrayi
: Counter; indexs
: String + lengte. Bv.s:5"apple"
.
Stringlengte
Strings worden voorafgegaan door de lengte van de betreffende string, gemeten in UTF8-bytes. Als dit getal niet klopt, wordt de betreffende serialised data, niet verwerkt.
Voorbeelden:
# The string "0...9" is exactly 10 bytes long: Every character is exactly one byte # s:10:"0123456789"
# The letter "è" takes two bytes, hence the whole string is 7 bytes long, although only 4 characters long # s:7:"Bèèè"
concat('s:27:"Balais de charbon d', char(39), 'origine"') # # * Balais de charbon d → 19 characters & 19 bytes # * char(39) → 1 character & 1 byte # * origine → 7 characters & 7 bytes # # * Total: 27 bytes
Enveloping symbols
- Usually, you can choose to envelop a string within 'apostrophes' or within "quotes"
- However, we have a complication here: The actual string already contains both apostrophes & quotes
- The way to solve this: Use apostrophes for enveloping the string. When an apostrophe is needed inside the string, and use char(39) to represent the apostrophe inside the string
- BTW:
char(39)
counts as one byte
Voorbeeld:
concat('s:27:"Balais de charbon d', char(39), 'origine"') # # * The string (inside the apostrophes) becomes # * s:27:"Balais de charbon d'origine"
MySQL - Sproc replace_serial
Dit was een leuk puzzeltje om te doen, maar het practische nut is helaas beperkt.
CREATE DEFINER=`strompf`@`localhost` PROCEDURE `replace_serial`( in table_name_in tinytext, # Source table name in col_serial_data_src_in tinytext, # Column with source serialised data in col_serial_data_dst_in tinytext, # Column with destination serialised data in col_src_in tinytext, # Column with source text strings in col_dst_in tinytext # Column with desination text string ) BEGIN # # Find-and-replace within serialised content ################################################################################ # # * This sproc "replace_serial" needs as input pointers to a table # ("pointer table") with the following columns: # * Column with the source serialised data # * Column for storing the updated serialised data # * Column with find-string # * Column with replace-strings # * It can be any table: The arguments for replace_serial() contains the name # of this table and the names of those four columns # * Internally, data from this input-table is copied to temporary table # "serial_tmp" # * The designated serialised data is decomposed into several fields (again, # in table "serial_tmp") - Currently, this sproc only works for serialised # data with 1 a-element, followed by 6 s-elements. This sproc has not been # tested with other configurations # * Find-and-replace is only performed on element s4 (=the 4th s-element) # * Subsequently, the new length of s4 is calculated, and the serialised string # is written to the designated column in the "pointer table" # * Subsequently, it's up to the programmer to process the resulting data in # the pointer table # * This approach with a "pointer table" works really well: Easy to grasp, # flexible and easy to debug. It's especially handy for debugging or further # processing that the output is in a separate column from the input. # * Copyright Jeroen Strompf - De Vliegende Brigade - 2021.03.09 # # # History ######################################## # # * 2021.03.09: Created, tested and pushed to "bal_dwh_org" # # ################################################################################ # Define working table "serial_tmp" ################################################################################ # drop table if exists serial_tmp; create temporary table serial_tmp ( serial_data_src varchar(255) comment "Actual serialised data", serial_data_dst varchar(255) comment "Resulting serialised data. This is a separate field, to make debugging easier", find_string varchar(255) comment "String to be found within serial_data_src", replace_string varchar(255) comment "Replacement string for within serial_data_src, resulting in serial_data_dst", a_len smallint, a_string varchar(255) comment "The string within the accolade", s1 varchar(255) comment "Complete s-part, including length statement and string", s2 varchar(255), s3 varchar(255), s4 varchar(255) comment "Complete s4-string", s4_len_src smallint comment "Original length of s4", s4_string_src varchar(255) comment "Original s4 string", s4_len_dst smallint comment "New length", s4_string_dst varchar(255) comment "New string", s5 varchar(255), s6 varchar(255) ); ################################################################################ # Copy working data to working table ################################################################################ # set @ddl=concat ( "insert into serial_tmp ( serial_data_src, find_string, replace_string ) select ", col_serial_data_src_in, ", ", col_src_in, ", ", col_dst_in, " ", "from ", table_name_in, ";" ); prepare stmt from @ddl; execute stmt; deallocate prepare stmt; ################################################################################ # Decompose serialised data ################################################################################ # # Leading a-field ######################################## # update serial_tmp set a_len = substring_index(substring_index(serial_data_src, ":", 2), ":", -1), a_string = substring_index(substring_index(serial_data_src, "}",1), "{", -1); # Complete s-fields ######################################## # update serial_tmp set s1 = substring_index(a_string, ";", 1), s2 = substring_index(substring_index(a_string, ";", 2), ";", -1), s3 = substring_index(substring_index(a_string, ";", 3), ";", -1), s4 = substring_index(substring_index(a_string, ";", 4), ";", -1), s5 = substring_index(substring_index(a_string, ";", 5), ";", -1), s6 = substring_index(substring_index(a_string, ";", 6), ";", -1); # Further decompose s4 ######################################## # update serial_tmp set s4_len_src = substring_index(substring_index(s4, ":", 2), ":", -1), s4_string_src = substring_index(substring_index(s4, '"', 2), '"', -1); ################################################################################ # Find-and-replace ################################################################################ # # Finally! The heart of the matter # update serial_tmp set s4_string_dst = replace(s4_string_src, find_string, replace_string), s4_len_dst = length(s4_string_dst); ################################################################################ # Reassemble ################################################################################ # update serial_tmp set serial_data_dst = concat ( "a:",a_len, ":", "{", s1, ";", s2, ";", s3, ";", "s:",s4_len_dst,":", char(34), s4_string_dst, char(34), ";", s5, ";", s6, ";", "}" ); ################################################################################ # Copy results back to the original input table ################################################################################ # # * Fill in the result in column "col_serial_data_dst_in" # set @ddl2=concat ( "update ", table_name_in, " join serial_tmp on serial_tmp.serial_data_src = ", table_name_in, ".",col_serial_data_src_in," set ",table_name_in, ".",col_serial_data_dst_in, " = serial_tmp.serial_data_dst;" ); prepare stmt from @ddl2; execute stmt; deallocate prepare stmt; END
Bronnen
- https://stackoverflow.com/questions/13319638/working-with-serialized-data-in-wordpress
- https://stackoverflow.com/questions/46908387/how-to-use-a-serialized-or-json-encoded-attribute-in-a-mysql-stored-procedure
- https://stackoverflow.com/questions/8922668/updating-base64-encode-serialized-array-on-mysql-databases - unserialize, re-serialize
- https://stackoverflow.com/questions/5335970/unserialize-through-query-at-database-level-itself
- https://stackoverflow.com/questions/62980734/deserialize-and-serialize-data-in-mysql - Same use case as me. I've posted my sproc here
- https://wpengine.com/support/wordpress-serialized-data