mirror of
https://github.com/KDE/konsole.git
synced 2025-12-23 23:38:08 -05:00
Summary:
The uni2characterwidth tool converts Unicode Character Database files into character width lookup tables. It uses a template file to place the tables in a source code file together with a function that finds the width of a specified character.

It can also generate several forms of lists with width data for debugging and testing, or for future use as a replacement for the Unicode files.

Set the `KONSOLE_BUILD_UNI2CHARACTERWIDTH` CMake flag to build the tool. Use the `--help` argument for more detailed usage.

It is possible to generate a separate "width" for Ambiguous characters. This can be used to make the width of those characters configurable in Konsole settings.

The `template.example` file contains all possible named tags, and some additional tags to show how to use them.

CCBUG: 396435

Depends on D15756

Test Plan:
Download the files listed below from the `11.0.0` and `emoji/11.0` directories on `https://unicode.org/Public/`. You can also use URLs to the files directly.

* UnicodeData.txt
* EastAsianWidth.txt
* emoji-data.txt

Generate any available list except compact-ranges (e.g. `details`):

```
uni2characterwidth \
    -U UnicodeData.txt -A EastAsianWidth.txt -E emoji-data.txt \
    -g details result.txt
```

The list should contain ranges for all possible widths (-2, -1, 0, 1, 2). You can pick some characters whose width you know and check how they were classified.

-2 is a special non-standard width for ambiguous characters, which can be overridden by adding the `-a 1` or `-a 2` parameter. With this flag, all ranges from the -2 group should disappear and be assigned to the selected width (1 or 2).

Generate output using a template:

```
uni2characterwidth \
    -U UnicodeData.txt -A EastAsianWidth.txt -E emoji-data.txt \
    -g code,./template.example result.txt
```

Reviewers: #konsole, hindenburg
Reviewed By: #konsole, hindenburg
Subscribers: hindenburg, konsole-devel
Tags: #konsole
Differential Revision: https://phabricator.kde.org/D15757
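
The summary above mentions a direct lookup table plus per-width range tables and a function that finds the width of a character. A minimal sketch of how such a lookup could work is shown below. It is not the code the tool generates: the table contents, the names `DIRECT_LUT`, `RANGES_LUT_LIST`, `in_ranges` and `character_width`, the use of -1 for control characters, and the default width of 1 are all assumptions made for illustration.

```python
from bisect import bisect_right

# Illustrative data only (assumption), not real tool output.
# Widths of the first 256 code points, indexed directly by code point.
DIRECT_LUT = [-1] * 32 + [1] * 95 + [-1] * 33 + [1] * 96

# One table per width: (width, sorted non-overlapping (first, last) ranges).
RANGES_LUT_LIST = [
    (0, [(0x0300, 0x036F)]),
    (2, [(0x1100, 0x115F), (0x4E00, 0x9FFF)]),
]


def in_ranges(ranges, code_point):
    """Binary search: is code_point inside one of the sorted ranges?"""
    # Index of the last range whose first element is <= code_point.
    i = bisect_right(ranges, (code_point, 0x110000)) - 1
    return i >= 0 and code_point <= ranges[i][1]


def character_width(code_point, default=1):
    """Look up a width: direct table first, then the per-width range tables."""
    if code_point < len(DIRECT_LUT):
        return DIRECT_LUT[code_point]
    for width, ranges in RANGES_LUT_LIST:
        if in_ranges(ranges, code_point):
            return width
    # Assumption: anything not listed is treated as a normal-width character.
    return default
```

With this sample data, `character_width(0x41)` takes the direct-table path, while `character_width(0x4E2D)` falls through to the binary search over the width-2 ranges.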
78 lines
2.6 KiB
Plaintext
«*COMMENT:----------------------------------------------------------------------

Tags:

«*anything:a comment in which everything except the closing sequence is
allowed:anything*»

«NAME:any content, including other tags. Literal colons must be escaped as \:
The content is processed using data passed from the code() function under the
NAME key. It should contain other tags; without them the text is replaced with
the passed data or removed.»

«NAME» - same as above, for when the data should simply replace the tag, so no
content is needed

EXAMPLE:
data: Map{ "exampleA", Map{ { "Number", 42 }, { "String", "hello" } } }
template: «exampleA:number\: «Number», string\: «String»»
result: number: 42, string: hello

«» - empty anonymous element. Used inside named elements that receive lists.
The element is replaced with a list item, and duplicated if

«:anonymous container. It should contain elements that receive data.
The container disappears when a child element receives no value.
Useful for adding prefixes/suffixes to data»

EXAMPLE:
data: Map{ "exampleB", Vector{ 1, 2, 3, 4, 5, 6, 7 } }
template: «exampleB:«:[«»] »»
result: [1] [2] [3] [4] [5] [6] [7]

data: Map{ "exampleC", Vector{ "a", "b", "c" } }
template: «exampleC:«:first = «»»«:, second = «»»«:, third = «»»«:, fourth = «»»»
result: first = a, second = b, third = c
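
The behaviour shown in exampleB and exampleC can be imitated with a short Python sketch. It is not the tool's implementation: `render_list` and its (prefix, suffix) slot model are hypothetical, and cycling through the slots when the list is longer than the slot count is an assumption inferred from exampleB.

```python
from itertools import cycle


def render_list(items, slots):
    """Fill (prefix, suffix) slots with list items, cycling through the slots.

    Each slot stands for one anonymous container around an empty element.
    A slot consumes one item; rendering stops when the items run out, so
    slots that would receive no value produce no output.
    """
    out = []
    for item, (prefix, suffix) in zip(items, cycle(slots)):
        out.append(f"{prefix}{item}{suffix}")
    return "".join(out)


# exampleB: one container, duplicated for every list item.
example_b = render_list([1, 2, 3, 4, 5, 6, 7], [("[", "] ")])

# exampleC: four containers but only three items, so the fourth disappears.
example_c = render_list(
    ["a", "b", "c"],
    [("first = ", ""), (", second = ", ""), (", third = ", ""), (", fourth = ", "")],
)
```

Here `example_c` is `first = a, second = b, third = c`, and `example_b` matches the exampleB result with a trailing space after the last item.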

«!fmt "XXX":a wrapper that sets the printf-like format XXX for numbers and
strings inside it. XXX starts with %.»

«!repeat N:repeats its contents N times.»

EXAMPLE:
data: Map{ "exampleD", Vector{ 1, 2, 3, 4, 10, 11, 12, 13 } }
template: «exampleD:«!fmt "%#.2x":«!repeat 3:«» »«»; »»
result: 0x01 0x02 0x03 0x04; 0x0a 0x0b 0x0c 0x0d;
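
To see how the exampleD result comes about, the sketch below expands the repeat by hand and applies the format to each item. It is only an approximation: `render_formatted` is a hypothetical helper, Python's `%` operator stands in for the tool's printf-like formatting (the two may differ in corner cases), and reusing the slot pattern until the list is exhausted is an assumption.

```python
def render_formatted(items, fmt, suffixes):
    """Format each item with a printf-like format and append its slot suffix.

    The suffix pattern is reused from the start once all slots are filled.
    """
    out = []
    for i, item in enumerate(items):
        out.append(fmt % item + suffixes[i % len(suffixes)])
    return "".join(out)


# "repeat 3" of (item + space), then one (item + "; "): four slots in total.
suffixes = [" "] * 3 + ["; "]

example_d = render_formatted([1, 2, 3, 4, 10, 11, 12, 13], "%#.2x", suffixes)
```

`example_d` is `0x01 0x02 0x03 0x04; 0x0a 0x0b 0x0c 0x0d; ` (with a trailing space): the `#` flag adds the `0x` prefix and the precision `.2` pads to at least two hex digits.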

D: «exampleD:«!fmt "%#.2x":«!repeat 3:«» »«»; »»
----------------------------------------------------------------------:COMMENT*»
For available data see code() function. Below are usage examples

Warning about generated file - putting "this is a generated file" text in a
template file could be misleading.
«gen-file-warning»


Command used to generate the file:
«cmdline»


Direct LUT - widths of the first 256 code points in direct access array:
{«!fmt "% d":«direct-lut:
«!repeat 32:«:«»,»»
»»}


Arrays with code point ranges for every width:
«ranges-luts:«:
«name» = {«!fmt "%#.6x":«ranges:
«!repeat 8:«:{«first»,«last»},»»
»»}
Number of elements in the array: «size»

»»
List of array names, sizes, and widths:
{«ranges-lut-list:
«:{«!fmt "% d":«width»», «!fmt "%-16s":«name»», «size»},»
»}
Number of elements in the array: «ranges-lut-list-size»;