Used to generate a (hopefully) clean JSON glosary file from a messy .txt file cut-and-pasted from somewhere like here.
Fill in infilename
:
In [1]:
infilename = 'slave_status.txt'
outfilename = infilename.split('.')[0] + '.json'
A line containing 1+ term is recognized because it contains a single word, or multiple words with a comma between each. Lines with multiple words separated by only whitespace are understood to be part of the definition of the term above.
TERM1
multiple words of definition
TERM2[, TERM2b, TERM2c,...]
definition for term 2
When multiple terms sit on the same line, like TERM2
, TERM2b
, TERM2c
, they will each go into the glossary with a copy of the same definition.
In [2]:
import re
term_seer = re.compile(r"^(?P<term1>\w+)(?P<moreterms>,\s\w+)*$")
In [3]:
with open(infilename) as infile:
content = infile.read()
gloss = {}
terms = []
defn = []
for line in content.splitlines():
terms_seen = term_seer.search(line)
if terms_seen and terms_seen.group('term1') != 'Note':
if terms:
defn = "\n".join(defn)
for term in terms:
if not term:
sfas
gloss[term] = defn
terms = [terms_seen.group('term1')]
if terms_seen.group('moreterms'):
terms.extend(terms_seen.group('moreterms').replace(","," ").split())
defn = []
else:
defn.append(line)
defn = "\n".join(defn)
for term in terms:
gloss[term] = defn
In [4]:
import json
In [5]:
json.dump(gloss, open(outfilename, 'w'))
In [6]:
!cat {outfilename}
{"Last_Error": "\nThe error number and error message returned by the most recently executed statement. An error number of 0 and message of the empty string mean \u201cno error.\u201d If the Last_Error value is not empty, it also appears as a message in the slave's error log. For example:\n\nLast_Errno: 1051\nLast_Error: error 'Unknown table 'z'' on query 'drop table z'\nThe message indicates that the table z existed on the master and was dropped there, but it did not exist on the slave, so DROP TABLE failed on the slave. (This might occur, for example, if you forget to copy the table to the slave when setting up replication.)\n\nNote\nWhen the slave SQL thread receives an error, it reports the error first, then stops the SQL thread. This means that there is a small window of time during which SHOW SLAVE STATUS shows a nonzero value for Last_Errno even though Slave_SQL_Running still displays Yes.\n", "Seconds_Behind_Master": "\nThis field is an indication of how \u201clate\u201d the slave is:\n\nWhen the slave is actively processing updates, this field shows the difference between the current timestamp on the slave and the original timestamp logged on the master for the most event currently being processed on the slave.\n\nWhen no event is currently being processed on the slave, this value is 0.\n\nIn essence, this field measures the time difference in seconds between the slave SQL thread and the slave I/O thread. If the network connection between master and slave is fast, the slave I/O thread is very close to the master, so this field is a good approximation of how late the slave SQL thread is compared to the master. If the network is slow, this is not a good approximation; the slave SQL thread may quite often be caught up with the slow-reading slave I/O thread, so Seconds_Behind_Master often shows a value of 0, even if the I/O thread is late compared to the master. In other words, this column is useful only for fast networks.\n\nThis time difference computation works even if the master and slave do not have identical clock times, provided that the difference, computed when the slave I/O thread starts, remains constant from then on. Any changes\u2014including NTP updates\u2014can lead to clock skews that can make calculation of Seconds_Behind_Master less reliable.\n\nThis field is NULL (undefined or unknown) if the slave SQL thread is not running, or if the slave I/O thread is not running or is not connected to the master. For example, if the slave I/O thread is running but is not connected to the master and is sleeping for the number of seconds given by the CHANGE MASTER TO statement or --master-connect-retry option (default 60) before reconnecting, the value is NULL. This is because the slave cannot know what the master is doing, and so cannot say reliably how late it is.\n\nThe value of Seconds_Behind_Master is based on the timestamps stored in events, which are preserved through replication. This means that if a master M1 is itself a slave of M0, any event from M1's binary log that originates from M0's binary log has M0's timestamp for that event. This enables MySQL to replicate TIMESTAMP successfully. However, the problem for Seconds_Behind_Master is that if M1 also receives direct updates from clients, the Seconds_Behind_Master value randomly fluctuates because sometimes the last event from M1 originates from M0 and sometimes is the result of a direct update on M1.", "Master_User": "\nThe user name of the account used to connect to the master.\n", "Master_Port": "\nThe port used to connect to the master.\n", "Until_Log_Pos": "\nThe values specified in the UNTIL clause of the START SLAVE statement.\n\nUntil_Condition has these values:\n\nNone if no UNTIL clause was specified\n\nMaster if the slave is reading until a given position in the master's binary log\n\nRelay if the slave is reading until a given position in its relay log\n\nUntil_Log_File and Until_Log_Pos indicate the log file name and position that define the coordinates at which the SQL thread stops executing.\n", "Master_Log_File": "\nThe name of the master binary log file from which the I/O thread is currently reading.\n", "Read_Master_Log_Pos": "\nThe position in the current master binary log file up to which the I/O thread has read.\n", "Replicate_Do_DB": "\nThe lists of databases that were specified with the --replicate-do-db and --replicate-ignore-db options, if any.\n", "Exec_Master_Log_Pos": "\nThe position in the current master binary log file to which the SQL thread has read and executed, marking the start of the next transaction or event to be processed. You can use this value with the CHANGE MASTER TO statement's MASTER_LOG_POS option when starting a new slave from an existing slave, so that the new slave reads from this point. The coordinates given by (Relay_Master_Log_File, Exec_Master_Log_Pos) in the master's binary log correspond to the coordinates given by (Relay_Log_File, Relay_Log_Pos) in the relay log.\n", "Relay_Log_Space": "\nThe total combined size of all existing relay log files.\n", "Relay_Master_Log_File": "\nThe name of the master binary log file containing the most recent event executed by the SQL thread.\n", "Master_SSL_Allowed": "\nThese fields show the SSL parameters used by the slave to connect to the master, if any.\n\nMaster_SSL_Allowed has these values:\n\nYes if an SSL connection to the master is permitted\n\nNo if an SSL connection to the master is not permitted\n\nIgnored if an SSL connection is permitted but the slave server does not have SSL support enabled\n\nThe values of the other SSL-related fields correspond to the values of the MASTER_SSL_CA, MASTER_SSL_CAPATH, MASTER_SSL_CERT, MASTER_SSL_CIPHER, and MASTER_SSL_KEY options to the CHANGE MASTER TO statement. See Section 13.4.2.1, \u201cCHANGE MASTER TO Syntax\u201d.\n", "Slave_IO_State": "\nA copy of the State field of the SHOW PROCESSLIST output for the slave I/O thread. This tells you what the thread is doing: trying to connect to the master, waiting for events from the master, reconnecting to the master, and so on. For a listing of possible states, see Section 8.10.6, \u201cReplication Slave I/O Thread States\u201d. For versions of MySQL prior to 5.0.12, it is necessary to check this field for connection problems. In those versions, the thread could be running while unsuccessfully trying to connect to the master; only this field makes you aware of the problem. The state of the SQL thread is not copied because it is simpler. If it is running, there is no problem; if it is not, you can find the error in the Last_Error field (described later).\n\n", "Relay_Log_File": "\nThe name of the relay log file from which the SQL thread is currently reading and executing.\n", "Replicate_Ignore_DB": "\nThe lists of databases that were specified with the --replicate-do-db and --replicate-ignore-db options, if any.\n", "Until_Condition": "\nThe values specified in the UNTIL clause of the START SLAVE statement.\n\nUntil_Condition has these values:\n\nNone if no UNTIL clause was specified\n\nMaster if the slave is reading until a given position in the master's binary log\n\nRelay if the slave is reading until a given position in its relay log\n\nUntil_Log_File and Until_Log_Pos indicate the log file name and position that define the coordinates at which the SQL thread stops executing.\n", "Replicate_Do_Table": "\nThe lists of tables that were specified with the --replicate-do-table, --replicate-ignore-table, --replicate-wild-do-table, and --replicate-wild-ignore-table options, if any.\n", "Last_Errno": "\nThe error number and error message returned by the most recently executed statement. An error number of 0 and message of the empty string mean \u201cno error.\u201d If the Last_Error value is not empty, it also appears as a message in the slave's error log. For example:\n\nLast_Errno: 1051\nLast_Error: error 'Unknown table 'z'' on query 'drop table z'\nThe message indicates that the table z existed on the master and was dropped there, but it did not exist on the slave, so DROP TABLE failed on the slave. (This might occur, for example, if you forget to copy the table to the slave when setting up replication.)\n\nNote\nWhen the slave SQL thread receives an error, it reports the error first, then stops the SQL thread. This means that there is a small window of time during which SHOW SLAVE STATUS shows a nonzero value for Last_Errno even though Slave_SQL_Running still displays Yes.\n", "Master_Host": "\nThe master host that the slave is connected to.\n\n", "Master_SSL_Key": "\nThese fields show the SSL parameters used by the slave to connect to the master, if any.\n\nMaster_SSL_Allowed has these values:\n\nYes if an SSL connection to the master is permitted\n\nNo if an SSL connection to the master is not permitted\n\nIgnored if an SSL connection is permitted but the slave server does not have SSL support enabled\n\nThe values of the other SSL-related fields correspond to the values of the MASTER_SSL_CA, MASTER_SSL_CAPATH, MASTER_SSL_CERT, MASTER_SSL_CIPHER, and MASTER_SSL_KEY options to the CHANGE MASTER TO statement. See Section 13.4.2.1, \u201cCHANGE MASTER TO Syntax\u201d.\n", "Skip_Counter": "\nThe current value of the sql_slave_skip_counter system variable. See Section 13.4.2.6, \u201cSET GLOBAL sql_slave_skip_counter Syntax\u201d.\n", "Slave_SQL_Running": "\nWhether the SQL thread is started.\n", "Relay_Log_Pos": "\nThe position in the current relay log file up to which the SQL thread has read and executed.\n", "Slave_IO_Running": "\nWhether the I/O thread is started and has connected successfully to the master. Internally, the state of this thread is represented by one of the following three values:\n\nMYSQL_SLAVE_NOT_RUN. The slave I/O thread is not running. For this state, Slave_IO_Running is No.\n\nMYSQL_SLAVE_RUN_NOT_CONNECT. The slave I/O thread is running, but is not connected to a replication master. For this state, Slave_IO_Running depends on the server version as shown in the following table.\n\nMySQL Version\tSlave_IO_Running\n4.1 (4.1.13 and earlier); 5.0 (5.0.11 and earlier)\tYes\n4.1 (4.1.14 and later); 5.0 (5.0.12 and later)\tNo\n5.1\tNo\n5.5\tConnecting\nMYSQL_SLAVE_RUN_CONNECT. The slave I/O thread is running, and is connected to a replication master. For this state, Slave_IO_Running is Yes.\n", "Connect_Retry": "\nThe number of seconds between connect retries (default 60). This can be set with the CHANGE MASTER TO statement or --master-connect-retry option.\n", "Replicate_Wild_Ignore_Table": "\nThe lists of tables that were specified with the --replicate-do-table, --replicate-ignore-table, --replicate-wild-do-table, and --replicate-wild-ignore-table options, if any.\n"}
In [ ]:
Content source: catherinedevlin/py-glossarize
Similar notebooks: