Compare commits

...

214 Commits

Author SHA1 Message Date
Jakob Borg
3b34895ae6 LocalVersion can move backwards as well as forwards 2014-07-23 13:03:52 +02:00
Jakob Borg
91cc84c4e6 Hand incoming indexes on main goroutine (this should be fine now) 2014-07-23 13:03:36 +02:00
Jakob Borg
797e53c5ba Merge branch 'v0.8'
* v0.8:
  Handle WANPPPConnection devices (fixes #431)
  Revert "Add temporary debug logging for #344 (revert later)"
  incomingIndexes should not be a package variable (fixes #344)
  Continue discovery on connect errors (fixes #324)

Conflicts:
	files/set.go
	model/model.go
	protocol/protocol.go
2014-07-23 12:00:54 +02:00
Jakob Borg
c714a12ad7 Improve protocol & leveldb debugging 2014-07-23 11:55:55 +02:00
Jakob Borg
08ce9b09ec Test and fix reconnects during pull 2014-07-23 10:52:07 +02:00
Jakob Borg
3152152ed9 Always build discosrv by default 2014-07-23 08:42:49 +02:00
Jakob Borg
544fea51b0 Update all deps to latest version 2014-07-23 08:31:36 +02:00
Jakob Borg
08ca9f9378 Consolidate cmds in cmd/ 2014-07-23 08:31:13 +02:00
Jakob Borg
978f68b744 Update deps to unfail tests 2014-07-23 07:59:45 +02:00
Jakob Borg
680896e4c4 Merge pull request #433 from AudriusButkevicius/dup
Remove non-existing nodes from repositories
2014-07-23 07:58:03 +02:00
Jakob Borg
975627af2e Add AudriusButkevicius 2014-07-23 07:57:37 +02:00
Audrius Butkevicius
b208102b98 Remove non-existing nodes from repositories 2014-07-22 22:29:44 +01:00
Jakob Borg
88a063434c Handle WANPPPConnection devices (fixes #431) 2014-07-22 22:47:54 +02:00
Jakob Borg
bc0a8fcc1d Use language from query parameter 2014-07-22 20:27:36 +02:00
Jakob Borg
3b4fe19dfb Use compiled in assets for those not in STGUIASSETS dir 2014-07-22 20:11:36 +02:00
Jakob Borg
58cc108c0c Handle WANPPPConnection devices (fixes #431) 2014-07-22 19:23:43 +02:00
Jakob Borg
d3085a4127 Always ignore directory modification time (that stuff is nasty) 2014-07-21 10:50:15 +02:00
Jakob Borg
0fcc193197 Handle disconnected nodes better in puller 2014-07-21 10:50:10 +02:00
Jakob Borg
75d4d2df8b Remove SyncOrder, at least temporarily (sorry fREW)
Doesn't actually work very well with the batched approach to needed
files, not documented, not exposed in UI. I'll be happy to reintegrate
if this is solved.
2014-07-21 10:49:18 +02:00
Jakob Borg
28f2e8f24d Allow beta versions 2014-07-20 21:39:52 +02:00
Jakob Borg
f692e3ac73 Basic GUI translation support.
Conflicts:
	gui/index.html
2014-07-20 13:49:26 +02:00
Jakob Borg
bcb5f6f472 Remove unused comparison functions. 2014-07-18 11:43:42 +02:00
Jakob Borg
74fd4a3722 Tick version clock on received changes. 2014-07-18 11:41:51 +02:00
Jakob Borg
884bb638bc Fix locking screwup 2014-07-18 10:00:20 +02:00
Jakob Borg
3388d5b49c Use backend service to verify nodeID (fixes #418) 2014-07-18 10:00:02 +02:00
Jakob Borg
4fe2992924 Repair tests for latest changes 2014-07-17 14:48:02 +02:00
Jakob Borg
0a804e39a8 Fix scan interval for slow scans 2014-07-17 13:47:46 +02:00
Jakob Borg
f88a7a8e6a Publish more event details 2014-07-17 13:47:46 +02:00
Jakob Borg
ec212f73eb Tick version clock on load 2014-07-17 11:13:23 +02:00
Jakob Borg
91cc0cd05e Load localVersion for all nodes 2014-07-17 11:08:03 +02:00
Jakob Borg
7943902d73 Handle needed files in batches 2014-07-15 17:54:00 +02:00
Jakob Borg
32a5e83612 Avoid buffering the entire file list during walks 2014-07-15 14:27:46 +02:00
Jakob Borg
8b349945de Add Local Version field to files, send index in segments. 2014-07-15 13:04:37 +02:00
Jakob Borg
fccdd85cc1 Set TCP options on connections 2014-07-15 12:12:44 +02:00
Jakob Borg
bd2b5db8f3 Don't creash when replacing with empty file set 2014-07-15 00:06:54 +02:00
Jakob Borg
44bc5fd784 Pick up resurrected dirs 2014-07-14 23:59:11 +02:00
Jakob Borg
45dfd616cb Pick up dirs without a CurrentFiler 2014-07-14 23:58:37 +02:00
Jakob Borg
39a691a7e6 Remove compression 2014-07-14 23:52:11 +02:00
Jakob Borg
35b5999cba Refactor modals into template 2014-07-14 14:14:26 +02:00
Jakob Borg
d812f559ef Upgrade from within GUI (fixes #190) 2014-07-14 12:42:29 +02:00
Jakob Borg
54a1f37bf5 stevents: Print raw JSON 2014-07-13 21:39:35 +02:00
Jakob Borg
b0f46beffb Basic events interface 2014-07-13 21:07:24 +02:00
Jakob Borg
c844991cba Tests for previous commit 2014-07-13 21:07:04 +02:00
Jakob Borg
b7cf8a471f New port number for new format global discovery 2014-07-13 09:36:22 +02:00
Jakob Borg
864bb8bc34 Regenerate XDR 2014-07-13 09:24:25 +02:00
Jakob Borg
2d4b89a8e9 Slightly clean up XDR generator 2014-07-13 09:23:10 +02:00
Jakob Borg
a6d67d30f5 Fix XDR handling of int16 2014-07-13 09:16:40 +02:00
Jakob Borg
0a633c526f Copyright wording 2014-07-13 01:07:49 +02:00
Jakob Borg
655acb4cb2 Deprecate scanner.Block & File 2014-07-12 23:09:47 +02:00
Jakob Borg
91b35118d9 Don't go-install genxdr 2014-07-12 20:08:55 +02:00
Jakob Borg
c64321df47 Portable new line converter 2014-07-12 19:49:25 +02:00
Jakob Borg
3f791b57ce Temporarily remove solaris build 2014-07-12 19:49:25 +02:00
Jakob Borg
8de2a7f4c8 go vet is a test step 2014-07-12 19:49:25 +02:00
Jakob Borg
dbb4b67205 Move .stversions to repo root only (fixes #364) 2014-07-11 12:02:53 +02:00
Jakob Borg
f510f5f205 Refactor and improve integration tests 2014-07-11 12:02:53 +02:00
Jakob Borg
620eeae4a7 Tests to clarify glob patterns 2014-07-09 09:24:20 +02:00
Jakob Borg
50b37f1366 Revert "Add temporary debug logging for #344 (revert later)"
This reverts commit 5353659f9f.
2014-07-08 11:49:28 +02:00
Jakob Borg
a7b6e35467 incomingIndexes should not be a package variable (fixes #344) 2014-07-08 11:49:11 +02:00
Jakob Borg
4cf04a3e0d About dialog 2014-07-07 12:59:09 +02:00
Jakob Borg
27cd6e60f4 Fix localsize 2014-07-06 23:15:28 +02:00
Jakob Borg
2b9fc0fd43 Update all deps 2014-07-06 23:13:10 +02:00
Jakob Borg
d6c058c407 Ignore index 2014-07-06 19:22:07 +02:00
Jakob Borg
8fe5438b59 Don't need read lock in files/set 2014-07-06 19:21:58 +02:00
Jakob Borg
e937e51476 Add AppendXDR to XDR types, build.sh xdr 2014-07-06 19:21:37 +02:00
Jakob Borg
b7ea695caf CSRF protection should only cover /rest 2014-07-06 15:00:44 +02:00
Jakob Borg
31350b4352 Use LevelDB storage backend 2014-07-06 14:46:48 +02:00
Ben Sidhom
37d83a4e2e Continue discovery on connect errors (fixes #324)
Continues trying to connect to the discovery server at regular intervals despite
failure. Whether or not to retry and retry interval should be specified in
configuration (not currently in this fix).
2014-07-05 23:10:11 +02:00
Jakob Borg
4a88d1244d Merge branch 'bsidhom-master'
* bsidhom-master:
  Continue discovery on connect errors (fixes #324)
2014-07-05 23:01:37 +02:00
Jakob Borg
439049f672 Add bsidhom 2014-07-05 23:01:31 +02:00
Jakob Borg
ee10295d04 Remove martini, use standard http mux 2014-07-05 21:40:29 +02:00
Jakob Borg
2d272a3cac Bump max file size and count 2014-07-05 11:05:45 +02:00
Ben Sidhom
2b26891062 Continue discovery on connect errors (fixes #324)
Continues trying to connect to the discovery server at regular intervals despite
failure. Whether or not to retry and retry interval should be specified in
configuration (not currently in this fix).
2014-07-04 13:47:54 -07:00
Jakob Borg
3d7d4d845a Luhn error checking 2014-07-04 16:16:50 +02:00
Jakob Borg
c488179783 Correct Luhn alphabet names 2014-07-04 15:58:20 +02:00
Jakob Borg
cfb33321b0 Luhn docs 2014-07-04 15:56:33 +02:00
Jakob Borg
193cea95ce Revert "Add temporary debug logging for #344 (revert later)"
This reverts commit 5353659f9f.
2014-07-04 15:20:29 +02:00
Jakob Borg
3c4002e149 Merge branch 'v0.8'
* v0.8:
  Don't leak writer and index goroutines on close
  Clean up protocol locking and closing
  Send initial index in batches
  Always send initial index, even if empty (ref #344)
  Simplify locking in protocol.Index
  Protocol state machine on receiving side
  Log client version on connect
  Handle query parameters in UPnP control URL (fixes #211)
  Avoid deadlock during initial scan (fixes #389)
  Add temporary debug logging for #344 (revert later)
  Tone down UPnP not found message (fixes #406)
2014-07-04 15:16:41 +02:00
Jakob Borg
a720f90a70 Don't leak writer and index goroutines on close 2014-07-04 15:16:33 +02:00
Jakob Borg
4a6b43bcae Clean up protocol locking and closing 2014-07-03 13:37:20 +02:00
Jakob Borg
2f5a822ca4 Send initial index in batches 2014-07-03 12:30:10 +02:00
Jakob Borg
bc1d04f0b9 Always send initial index, even if empty (ref #344) 2014-07-02 21:50:11 +02:00
Jakob Borg
381795d6d0 Simplify locking in protocol.Index 2014-07-02 21:49:24 +02:00
Jakob Borg
6ade27641d Protocol state machine on receiving side 2014-07-02 21:33:30 +02:00
Jakob Borg
53898d2c60 Log client version on connect 2014-07-02 20:43:43 +02:00
Jakob Borg
91c4ff6009 Handle query parameters in UPnP control URL (fixes #211) 2014-07-02 20:28:03 +02:00
Jakob Borg
0aa067a726 Avoid deadlock during initial scan (fixes #389) 2014-07-02 07:40:27 +02:00
Jakob Borg
67445a6dda Refactor logo (fixes #403) 2014-07-01 22:20:18 +02:00
Jakob Borg
5353659f9f Add temporary debug logging for #344 (revert later) 2014-07-01 17:08:14 +02:00
Jakob Borg
7ac00e189b Tone down UPnP not found message (fixes #406) 2014-07-01 17:06:07 +02:00
Jakob Borg
071f4c0769 Remove reprecated st* utils 2014-07-01 12:20:25 +02:00
Jakob Borg
b57f4ed97e Improve XDR performance 2014-06-30 13:35:48 +02:00
Jakob Borg
7633b9672f XDR incorrect encoding of uint16; tests and benchmarks 2014-06-30 12:56:09 +02:00
Jakob Borg
4f6ee7c8eb Fix linux/freebsd/windows compilation 2014-06-30 01:51:58 +02:00
Jakob Borg
d7cc48eab2 Merge branch 'v0.8'
* v0.8:
  Increase deadlock timeout, make configurable (fixes #389, fixes #393)
  Remove spurious debug output in .stignore handling
  Connection notices are informational
  No need to hold a write lock in Override
  Don't whine about unexpected EOFs
  Ensure correct version string format

Conflicts:
	model/model.go
2014-06-30 01:47:32 +02:00
Jakob Borg
8f3effed32 Refactor node ID handling, use check digits (fixes #269)
New node ID:s contain four Luhn check digits and are grouped
differently. Code uses NodeID type instead of string, so it's formatted
homogenously everywhere.
2014-06-30 01:42:03 +02:00
Jakob Borg
fee8289c0a discosrv: Tunable limiter settings 2014-06-27 22:39:03 +02:00
Jakob Borg
a2da31056b Increase deadlock timeout, make configurable (fixes #389, fixes #393) 2014-06-26 11:29:41 +02:00
Jakob Borg
f97dd9d8d3 Logger should use stdout instead of stderr 2014-06-23 21:57:22 +02:00
Jakob Borg
2383579a64 Remove spurious debug output in .stignore handling 2014-06-23 21:54:28 +02:00
Jakob Borg
68750211ef Connection notices are informational 2014-06-23 15:38:37 +02:00
Jakob Borg
db3e3ade80 No need to hold a write lock in Override 2014-06-23 11:52:13 +02:00
Jakob Borg
e6f04ed238 Don't whine about unexpected EOFs 2014-06-23 10:52:09 +02:00
Jakob Borg
a6eb690e31 Ensure correct version string format 2014-06-23 10:40:09 +02:00
Jakob Borg
21518adfc8 Include MaxVersion in Cluster Config message 2014-06-23 09:31:59 +02:00
Jakob Borg
77fe8449ba Test script for REST interface 2014-06-22 18:18:21 +02:00
Jakob Borg
33e9a35f08 Don't deadlock on connect close while sending Index (fixes #386) 2014-06-22 08:17:58 +02:00
Jakob Borg
4ab4816556 Detect deadlock in model and panic 2014-06-21 12:35:53 +02:00
Jakob Borg
8e8a579bb2 Asset update for previous commit 2014-06-20 11:40:38 +02:00
Jakob Borg
efbdf72d20 Lower CPU usage at idle by reducing db polling 2014-06-20 00:28:45 +02:00
Jakob Borg
0e59b5678a Further clarify message ordering requirements (ref #377) 2014-06-19 01:59:58 +02:00
Jakob Borg
de75550415 Clarify requirements on config messages (ref #377) 2014-06-19 01:27:03 +02:00
Jakob Borg
4dbce32738 Simplify memory handling 2014-06-19 01:02:32 +02:00
Jakob Borg
b05fcbc9d7 Simplify usage reporting config options (fixes #370) 2014-06-18 12:54:30 +02:00
Jakob Borg
d09c71b688 Avoid build error in Go1.2 2014-06-18 11:02:59 +02:00
Jakob Borg
874d6760d4 Handle .stignore correctly on Windows (fixes #369) 2014-06-16 16:19:14 +02:00
Jakob Borg
26ebbee877 Hard override on changes from master repo 2014-06-16 10:47:02 +02:00
Jakob Borg
12eda0449a Build and memSize impl for Solaris 2014-06-16 10:19:32 +02:00
Jakob Borg
5a98f4e47c Mark repos with missing dir as invalid on startup (fixes #311) 2014-06-16 09:33:52 +02:00
Jakob Borg
964c903a68 Only keep track of version (not modified) for sent index 2014-06-16 07:40:17 +02:00
Jakob Borg
21b699826d Increase reconnect delay towards max 2014-06-15 20:32:26 +02:00
Jakob Borg
5fa8f8e50c Remove old index files on startup (fixes #366) 2014-06-15 20:31:26 +02:00
Jakob Borg
9ca87f5314 Don't attempt to use broadcast with IPv6 (ref #346) 2014-06-14 11:14:37 +02:00
Jakob Borg
537c6b3b69 Reduce ping time & timeout (ref #358) 2014-06-14 11:07:34 +02:00
Jakob Borg
48a3fac2da Show out of sync items, rename files->items (fixes #312, fixes #352) 2014-06-14 10:58:36 +02:00
Jakob Borg
fd73682806 Don't need to sync deletes for nonexistent files 2014-06-14 10:55:44 +02:00
Jakob Borg
34bd5b9dcf Better android detection 2014-06-13 20:45:57 +02:00
Jakob Borg
58c5e46206 Add build environment variable 2014-06-13 20:44:00 +02:00
Jakob Borg
4c61ab0f18 Request restart for GUI setting changes 2014-06-13 20:25:10 +02:00
Jakob Borg
f241b63e0e Logo with text 2014-06-13 01:57:03 +02:00
Jakob Borg
2ffdb5a82a Actually generate random certificate serials (fixes #361) 2014-06-13 01:49:30 +02:00
Jakob Borg
46e963443d Include system RAM size in usage report 2014-06-12 20:47:46 +02:00
Jakob Borg
66d4e9e5d7 Prevent possible reordering of Index/IndexUpdate on send (ref #344) 2014-06-12 18:07:06 +02:00
Jakob Borg
de382e33a3 Forget go1.2 2014-06-12 02:28:03 +02:00
Jakob Borg
3c6738da73 Limit damage of previous commit to ARM arch 2014-06-12 01:11:04 +02:00
Jakob Borg
18e5cb6793 Work around broken DNS on Android for usage reporting 2014-06-12 01:05:00 +02:00
Jakob Borg
9cd6b85c09 Remove dead code from previous commit 2014-06-11 22:29:49 +02:00
Jakob Borg
f40f3b3b7b Anonymous Usage Reporting 2014-06-11 20:06:53 +02:00
Jakob Borg
7454670b0a Drop and warn about non-normalized file names on Linux/Windows (fixes #329) 2014-06-11 17:51:31 +02:00
Jakob Borg
e63596681d Fix header in protocol spec (fixes #360) 2014-06-11 16:27:39 +02:00
Jakob Borg
3dbaa76dcb Fix embarrasing badge :) 2014-06-10 17:23:00 +02:00
Jakob Borg
8752003b50 Add embarassing badge 2014-06-10 17:05:15 +02:00
Jakob Borg
8716ed5aa4 Fix coveralls.io data pushing 2014-06-10 17:05:15 +02:00
Jakob Borg
38ac4e8f79 Serialize incoming indexes (fixes #344) 2014-06-10 17:05:15 +02:00
Arthur Axel 'fREW' Schmidt
70fc8a3064 push test coverage info to coveralls.io 2014-06-10 17:05:15 +02:00
Jakob Borg
7626c5d526 Merge pull request #357 from jpjp/patch-1
Change Name -> Node Name to match Add Repo dialog.
2014-06-10 16:09:22 +02:00
Jakob Borg
7e04c9d048 Information about HTTP certificate issues 2014-06-10 15:40:21 +02:00
jpjp
9eda8f2c7e Change Name -> Node Name to match Add Repo dialog. 2014-06-10 13:46:29 +02:00
Jakob Borg
456d9e870d Integration test, API key 2014-06-08 19:17:42 +02:00
Jakob Borg
a1533696a5 Travis badge 2014-06-08 07:40:57 +02:00
Jakob Borg
92499af323 Revert "Build for Solaris"
This reverts commit 5a2328d9a5.
2014-06-08 07:37:51 +02:00
Arthur Axel 'fREW' Schmidt
b2988cdd35 test against travis-ci 2014-06-08 07:37:42 +02:00
Arthur Axel 'fREW' Schmidt
82cfd37263 Allow prioritization of downloads based on name (fixes #174) 2014-06-08 07:16:25 +02:00
Jakob Borg
df381fd03f Let server side decide if restart is needed on config change 2014-06-07 04:00:46 +02:00
Jakob Borg
5a2328d9a5 Build for Solaris 2014-06-07 03:56:13 +02:00
Jakob Borg
b2f66cfb60 Reject index for existing repo from unshared node (fixes #342) 2014-06-06 21:48:29 +02:00
Jakob Borg
6d24e4f122 Test case for #342 2014-06-06 21:40:04 +02:00
Jakob Borg
2e2185165c Improve test suite, fix bug in Set.Global() 2014-06-05 15:32:11 +02:00
Jakob Borg
f0612e57c2 Integration tests with API key 2014-06-05 11:48:22 +02:00
Jakob Borg
e5d16ed08a Remove extra whitespace around node ID (fixes #335) 2014-06-05 11:29:05 +02:00
Jakob Borg
1cff9ccc63 API key change should take effect on restart only 2014-06-05 09:16:12 +02:00
Jakob Borg
20a018db2e Implement API keys 2014-06-04 22:00:55 +02:00
Jakob Borg
80c2b32b92 Implement CSRF protection for REST interface (fixes #287) 2014-06-04 21:20:07 +02:00
Jakob Borg
028e9bc17a Tweak Shared With wording 2014-06-04 15:05:23 +02:00
Jakob Borg
afc2d6fda4 Clarify repo mismatch message (fixes #331) 2014-06-04 14:17:48 +02:00
Jakob Borg
bec5c76631 Use unique name and O_EXCL for temporary indexes (fixes #332) 2014-06-04 13:43:59 +02:00
Jakob Borg
d87051ca99 Correct index save warning formatting (again) and change to info level 2014-06-04 10:54:29 +02:00
Jakob Borg
3798cebad0 Configurable log prefixing (fixes #278) 2014-06-04 10:24:30 +02:00
Jakob Borg
a477989950 Handle invalid file names (Windows) (fixes #238) 2014-06-04 10:09:27 +02:00
Jakob Borg
5065d1d0b4 Fix spurious xdr debug logging 2014-06-04 10:08:25 +02:00
Jakob Borg
829990c9ef Correct warning formatting 2014-06-03 09:38:41 +02:00
Jakob Borg
ac037e0fa3 Use clean node/repo href/id:s (fixes #317) 2014-06-02 23:30:53 +02:00
Jakob Borg
da42d51008 Merge pull request #320 from jedie/fix_small_screen2
CSS fix for small screen sizes, e.g. on mobile phones
2014-06-02 13:51:03 +02:00
JensDiemer
99027813ef CSS fix for small screen sizes, e.g. on mobile phones 2014-06-02 13:41:33 +02:00
Jakob Borg
9112ba8f0b Upper case should be valid in repo ID 2014-06-02 09:56:34 +02:00
Jakob Borg
843fd9bdbd Add license header 2014-06-01 22:50:14 +02:00
Jakob Borg
26c33c4a69 Remove obsolete mctest 2014-06-01 22:47:50 +02:00
Jakob Borg
2db76ae786 Total wire data should always be uint64 (fixes #315) 2014-06-01 21:56:05 +02:00
Jakob Borg
a0b15d006d Handle write errors while saving index cache 2014-05-31 23:45:27 +02:00
Jakob Borg
23b27fa24a Better XDR diagnostics 2014-05-31 23:45:27 +02:00
Jakob Borg
b6f580cbc2 Merge pull request #314 from cmtonkinson/master
case change in documentation
2014-05-31 23:40:32 +02:00
Chris Tonkinson
f2459ef331 case change in documentation 2014-05-31 11:04:25 -04:00
Jakob Borg
0a37fac794 Catch escaped debug print 2014-05-28 20:45:29 +02:00
Jakob Borg
2d9a822ed7 Text files in zip dists should be DOS format 2014-05-28 20:11:01 +02:00
Jakob Borg
98622ca4d0 Include CONTRIBUTORS in build, since LICENSE points to it 2014-05-28 20:11:01 +02:00
Jakob Borg
f7a25adcbd Check for error in directory walker (ref #308) 2014-05-28 20:11:01 +02:00
Jakob Borg
9bf13b253c Update GUI assets 2014-05-28 20:11:01 +02:00
Jakob Borg
2e8b639a34 Merge pull request #307 from KayoticSully/master
GUI will switch between http and https protocols on restart (fixes #252)
2014-05-28 20:09:38 +02:00
Ryan Sullivan
672f7a010f reverted 'use strict' 2014-05-28 14:06:48 -04:00
Ryan Sullivan
37e15c4368 forgot to update assets 2014-05-28 11:29:56 -04:00
Ryan Sullivan
4d7837ba96 Reset protocolChanged just incase 2014-05-28 11:29:08 -04:00
Ryan Sullivan
a6c8423905 Merge remote-tracking branch 'upstream/master' 2014-05-28 11:27:58 -04:00
Ryan Sullivan
832ed556d9 Resolved issue #252. Page will now refresh the protocol if it is changed 2014-05-28 11:26:38 -04:00
Jakob Borg
7c6fb018ca Fix UPnP line endings (ref #211) 2014-05-28 16:04:20 +02:00
Jakob Borg
9c5c06bf31 Update GUI assets 2014-05-28 14:27:08 +02:00
Jakob Borg
61e3daaead Add shortcut for syncing identical files 2014-05-28 14:27:08 +02:00
Jakob Borg
9c0fde795e Update test for relaxed compareClusterConfig 2014-05-28 14:27:08 +02:00
Jakob Borg
ce4f565e2f Add forgotten file 2014-05-28 14:27:08 +02:00
Jakob Borg
5369a62fd5 Allow repo mismatches to proceed (ref #223) 2014-05-28 12:39:33 +02:00
Jakob Borg
b44016ff70 Don't ping timeout during long transfers (fixes #280) 2014-05-28 13:25:06 +02:00
Jakob Borg
9f76c87880 Merge pull request #305 from jedie/versioning_name
"Simple File Versioning" -> "File Versioning"
2014-05-28 11:25:34 +02:00
Jakob Borg
42ae2898e1 Revert "More memory efficient index sending"
This reverts commit 593f098276.
2014-05-28 10:11:17 +02:00
JensDiemer
dd649a6be4 "Simple File Versioning" -> "File Versioning"
see: http://discourse.syncthing.net/t/v0-8-10-simple-file-versioning/259/7?u=jedie
2014-05-28 10:03:56 +02:00
Jakob Borg
593f098276 More memory efficient index sending 2014-05-28 09:31:46 +02:00
Jakob Borg
4a87221f16 Silence Windows chtime warnings (fixes #288) 2014-05-28 09:27:00 +02:00
Jakob Borg
7745ed34d3 Don't stop discovery on send errors (fixes #240) 2014-05-28 07:03:47 +02:00
Jakob Borg
8fe546c4a2 Don't start repo with non-directory root (fixes #276) 2014-05-28 06:55:30 +02:00
Jakob Borg
381f6aeaf6 Handle and prevent invalid repo ID. Validate node ID format. (fixes #286) 2014-05-28 05:27:34 +02:00
Jakob Borg
9154bacced Recompile assets for previous 2014-05-27 11:17:22 +02:00
Jakob Borg
dc0dc8efb4 Merge pull request #301 from jedie/reformat_table2
reformat "folder" and "shared with" table items
2014-05-27 11:12:21 +02:00
JensDiemer
b062d5dd7f reformat "folder" and "shared with" table items
using white-space:nowrap;
2014-05-27 10:58:55 +02:00
Jakob Borg
c519e582b5 Expand tilde on Windows as well (fixes #289) 2014-05-26 16:58:03 +02:00
Jakob Borg
6b9dce36bf Default listen host should be 0.0.0.0 (again) (ref #216) 2014-05-26 15:01:04 +02:00
Jakob Borg
8e0520887a Send default permissions 0777 on directories when IgnorePerms set (ref #284) 2014-05-26 11:09:35 +02:00
Jakob Borg
cfd1fdb38e Don't set permissions 000 on directories with NoPermissionBits set (ref #284) 2014-05-26 11:08:54 +02:00
365 changed files with 60247 additions and 6269 deletions

6
.gitignore vendored
View File

@@ -1,10 +1,10 @@
syncthing
syncthing.exe
stcli
stcli.exe
*.tar.gz
*.zip
*.asc
*.sublime*
discosrv
stpidx
.jshintrc
coverage.out
files/pidx

20
.travis.yml Normal file
View File

@@ -0,0 +1,20 @@
language: go
go:
- tip
install:
- export PATH=$PATH:$HOME/gopath/bin
- ./build.sh setup
- go get code.google.com/p/go.tools/cmd/cover
- go get github.com/mattn/goveralls
script:
- ./build.sh test-cov
after_success:
- goveralls -coverprofile=coverage.out -service=travis-ci -package=calmh/syncthing -repotoken="$COVERALS_TOKEN"
env:
global:
secure: "zEV2h2XtKHNLVdXJjM4LA/VjMfLVydm6goF+ARit+nOSGxGoH7f7jIdzJzhxgh7shKG93q61eLO1Tug+WBMYB2EpBuYnTB5AIMYhCDwNI8C4uBV6c3brHfcrie7MASNao8TID2QScASKNFFWvjv/i1Ccn5ztxdcQuhSsNjGZp8A="

View File

@@ -11,7 +11,8 @@ Commons Attribution 4.0 International License. You retain the copyright
to code you have written.
When accepting your first contribution, the maintainer of the project
will ensure that you are added to the CONTRIBUTORS file.
will ensure that you are added to the CONTRIBUTORS file. You are welcome
to add yourself as a separate commit in your first pull request.
## Building

View File

@@ -1,5 +1,8 @@
Aaron Bieber <qbit@deftly.net>
Andrew Dunham <andrew@du.nham.ca>
Audrius Butkevicius <audrius.butkevicius@gmail.com>
Arthur Axel fREW Schmidt <frew@afoolishmanifesto.com>
Ben Sidhom <bsidhom@gmail.com>
Brandon Philips <brandon@ifup.org>
James Patterson <jamespatterson@operamail.com>
Jens Diemer <github.com@jensdiemer.de>

49
Godeps/Godeps.json generated
View File

@@ -1,53 +1,56 @@
{
"ImportPath": "github.com/calmh/syncthing",
"GoVersion": "go1.2.2",
"GoVersion": "devel +6bdbf9086c00 Thu Jul 17 17:02:46 2014 +1000",
"Packages": [
"./cmd/syncthing",
"./cmd/assets",
"./discover/cmd/discosrv"
"./cmd/..."
],
"Deps": [
{
"ImportPath": "bitbucket.org/kardianos/osext",
"Comment": "null-9",
"Rev": "364fb577de68fb646c4cb39cc0e09c887ee16376"
"Comment": "null-13",
"Rev": "5d3ddcf53a508cc2f7404eaebf546ef2cb5cdb6e"
},
{
"ImportPath": "code.google.com/p/go.crypto/bcrypt",
"Comment": "null-185",
"Rev": "6478cc9340cbbe6c04511280c5007722269108e9"
"Comment": "null-213",
"Rev": "aa2644fe4aa50e3b38d75187b4799b1f0c9ddcef"
},
{
"ImportPath": "code.google.com/p/go.crypto/blowfish",
"Comment": "null-185",
"Rev": "6478cc9340cbbe6c04511280c5007722269108e9"
"Comment": "null-213",
"Rev": "aa2644fe4aa50e3b38d75187b4799b1f0c9ddcef"
},
{
"ImportPath": "code.google.com/p/go.net/html",
"Comment": "null-137",
"Rev": "335bcad0b70a3ba61b9951992a418c9176a73fd4"
},
{
"ImportPath": "code.google.com/p/go.text/transform",
"Comment": "null-81",
"Rev": "9cbe983aed9b0dfc73954433fead5e00866342ac"
"Comment": "null-88",
"Rev": "1506dcc33592c369c3be7bd30b38f90445b86deb"
},
{
"ImportPath": "code.google.com/p/go.text/unicode/norm",
"Comment": "null-81",
"Rev": "9cbe983aed9b0dfc73954433fead5e00866342ac"
"Comment": "null-88",
"Rev": "1506dcc33592c369c3be7bd30b38f90445b86deb"
},
{
"ImportPath": "github.com/codegangsta/inject",
"Rev": "9aea7a2fa5b79ef7fc00f63a575e72df33b4e886"
},
{
"ImportPath": "github.com/codegangsta/martini",
"Comment": "v0.1-142-g8659df7",
"Rev": "8659df7a51aebe6c6120268cd5a8b4c34fa8441a"
"ImportPath": "code.google.com/p/snappy-go/snappy",
"Comment": "null-15",
"Rev": "12e4b4183793ac4b061921e7980845e750679fd0"
},
{
"ImportPath": "github.com/golang/groupcache/lru",
"Rev": "d781998583680cda80cf61e0b37dd0cd8da2eb52"
"Rev": "8b25adc0f62632c810997cb38c21111a3f256bf4"
},
{
"ImportPath": "github.com/juju/ratelimit",
"Rev": "cbaa435c80a9716e086f25d409344b26c4039358"
"Rev": "f9f36d11773655c0485207f0ad30dc2655f69d56"
},
{
"ImportPath": "github.com/syndtr/goleveldb/leveldb",
"Rev": "ba4481e4cb1d45f586e32be2ab663f173b08b207"
},
{
"ImportPath": "github.com/vitrun/qart/coding",

View File

@@ -4,13 +4,17 @@
package osext
import "syscall"
import (
"syscall"
"os"
"strconv"
)
func executable() (string, error) {
f, err := Open("/proc/" + itoa(Getpid()) + "/text")
if err != nil {
return "", err
}
defer f.Close()
return syscall.Fd2path(int(f.Fd()))
f, err := os.Open("/proc/" + strconv.Itoa(os.Getpid()) + "/text")
if err != nil {
return "", err
}
defer f.Close()
return syscall.Fd2path(int(f.Fd()))
}

View File

@@ -14,7 +14,7 @@ import (
"unsafe"
)
var startUpcwd, getwdError = os.Getwd()
var initCwd, initCwdErr = os.Getwd()
func executable() (string, error) {
var mib [4]int32
@@ -26,20 +26,20 @@ func executable() (string, error) {
}
n := uintptr(0)
// get length
_, _, err := syscall.Syscall6(syscall.SYS___SYSCTL, uintptr(unsafe.Pointer(&mib[0])), 4, 0, uintptr(unsafe.Pointer(&n)), 0, 0)
if err != 0 {
return "", err
// Get length.
_, _, errNum := syscall.Syscall6(syscall.SYS___SYSCTL, uintptr(unsafe.Pointer(&mib[0])), 4, 0, uintptr(unsafe.Pointer(&n)), 0, 0)
if errNum != 0 {
return "", errNum
}
if n == 0 { // shouldn't happen
if n == 0 { // This shouldn't happen.
return "", nil
}
buf := make([]byte, n)
_, _, err = syscall.Syscall6(syscall.SYS___SYSCTL, uintptr(unsafe.Pointer(&mib[0])), 4, uintptr(unsafe.Pointer(&buf[0])), uintptr(unsafe.Pointer(&n)), 0, 0)
if err != 0 {
return "", err
_, _, errNum = syscall.Syscall6(syscall.SYS___SYSCTL, uintptr(unsafe.Pointer(&mib[0])), 4, uintptr(unsafe.Pointer(&buf[0])), uintptr(unsafe.Pointer(&n)), 0, 0)
if errNum != 0 {
return "", errNum
}
if n == 0 { // shouldn't happen
if n == 0 { // This shouldn't happen.
return "", nil
}
for i, v := range buf {
@@ -48,35 +48,32 @@ func executable() (string, error) {
break
}
}
var strpath string
if buf[0] != '/' {
var e error
if strpath, e = getAbs(buf); e != nil {
return strpath, e
var err error
execPath := string(buf)
// execPath will not be empty due to above checks.
// Try to get the absolute path if the execPath is not rooted.
if execPath[0] != '/' {
execPath, err = getAbs(execPath)
if err != nil {
return execPath, err
}
} else {
strpath = string(buf)
}
// darwin KERN_PROCARGS may return the path to a symlink rather than the
// actual executable
// For darwin KERN_PROCARGS may return the path to a symlink rather than the
// actual executable.
if runtime.GOOS == "darwin" {
if strpath, err := filepath.EvalSymlinks(strpath); err != nil {
return strpath, err
if execPath, err = filepath.EvalSymlinks(execPath); err != nil {
return execPath, err
}
}
return strpath, nil
return execPath, nil
}
func getAbs(buf []byte) (string, error) {
if getwdError != nil {
return string(buf), getwdError
} else {
if buf[0] == '.' {
buf = buf[1:]
}
if startUpcwd[len(startUpcwd)-1] != '/' && buf[0] != '/' {
return startUpcwd + "/" + string(buf), nil
}
return startUpcwd + string(buf), nil
func getAbs(execPath string) (string, error) {
if initCwdErr != nil {
return execPath, initCwdErr
}
// The execPath may begin with a "../" or a "./" so clean it first.
// Join the two paths, trailing and starting slashes undetermined, so use
// the generic Join function.
return filepath.Join(initCwd, filepath.Clean(execPath)), nil
}

View File

@@ -53,6 +53,15 @@ func TestBcryptingIsCorrect(t *testing.T) {
}
}
func TestVeryShortPasswords(t *testing.T) {
key := []byte("k")
salt := []byte("XajjQvNhvvRt5GSeFk1xFe")
_, err := bcrypt(key, 10, salt)
if err != nil {
t.Errorf("One byte key resulted in error: %s", err)
}
}
func TestTooLongPasswordsWork(t *testing.T) {
salt := []byte("XajjQvNhvvRt5GSeFk1xFe")
// One byte over the usual 56 byte limit that blowfish has

View File

@@ -192,19 +192,13 @@ func TestCipherDecrypt(t *testing.T) {
}
func TestSaltedCipherKeyLength(t *testing.T) {
var key []byte
for i := 0; i < 4; i++ {
_, err := NewSaltedCipher(key, []byte{'a'})
if err != KeySizeError(i) {
t.Errorf("NewSaltedCipher with short key, gave error %#v, expected %#v", err, KeySizeError(i))
}
key = append(key, 'a')
if _, err := NewSaltedCipher(nil, []byte{'a'}); err != KeySizeError(0) {
t.Errorf("NewSaltedCipher with short key, gave error %#v, expected %#v", err, KeySizeError(0))
}
// A 57-byte key. One over the typical blowfish restriction.
key = []byte("012345678901234567890123456789012345678901234567890123456")
_, err := NewSaltedCipher(key, []byte{'a'})
if err != nil {
key := []byte("012345678901234567890123456789012345678901234567890123456")
if _, err := NewSaltedCipher(key, []byte{'a'}); err != nil {
t.Errorf("NewSaltedCipher with long key, gave error %#v", err)
}
}

View File

@@ -26,14 +26,13 @@ func (k KeySizeError) Error() string {
}
// NewCipher creates and returns a Cipher.
// The key argument should be the Blowfish key, 4 to 56 bytes.
// The key argument should be the Blowfish key, from 1 to 56 bytes.
func NewCipher(key []byte) (*Cipher, error) {
var result Cipher
k := len(key)
if k < 4 || k > 56 {
if k := len(key); k < 1 || k > 56 {
return nil, KeySizeError(k)
}
initCipher(key, &result)
initCipher(&result)
ExpandKey(key, &result)
return &result, nil
}
@@ -44,11 +43,10 @@ func NewCipher(key []byte) (*Cipher, error) {
// bytes. Only the first 16 bytes of salt are used.
func NewSaltedCipher(key, salt []byte) (*Cipher, error) {
var result Cipher
k := len(key)
if k < 4 {
if k := len(key); k < 1 {
return nil, KeySizeError(k)
}
initCipher(key, &result)
initCipher(&result)
expandKeyWithSalt(key, salt, &result)
return &result, nil
}
@@ -81,7 +79,7 @@ func (c *Cipher) Decrypt(dst, src []byte) {
dst[4], dst[5], dst[6], dst[7] = byte(r>>24), byte(r>>16), byte(r>>8), byte(r)
}
func initCipher(key []byte, c *Cipher) {
func initCipher(c *Cipher) {
copy(c.p[0:], p[0:])
copy(c.s0[0:], s0[0:])
copy(c.s1[0:], s1[0:])

View File

@@ -0,0 +1,78 @@
// Copyright 2012 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Package atom provides integer codes (also known as atoms) for a fixed set of
// frequently occurring HTML strings: tag names and attribute keys such as "p"
// and "id".
//
// Sharing an atom's name between all elements with the same tag can result in
// fewer string allocations when tokenizing and parsing HTML. Integer
// comparisons are also generally faster than string comparisons.
//
// The value of an atom's particular code is not guaranteed to stay the same
// between versions of this package. Neither is any ordering guaranteed:
// whether atom.H1 < atom.H2 may also change. The codes are not guaranteed to
// be dense. The only guarantees are that e.g. looking up "div" will yield
// atom.Div, calling atom.Div.String will return "div", and atom.Div != 0.
package atom
// Atom is an integer code for a string. The zero value maps to "".
type Atom uint32
// String returns the atom's name.
func (a Atom) String() string {
start := uint32(a >> 8)
n := uint32(a & 0xff)
if start+n > uint32(len(atomText)) {
return ""
}
return atomText[start : start+n]
}
func (a Atom) string() string {
return atomText[a>>8 : a>>8+a&0xff]
}
// fnv computes the FNV hash with an arbitrary starting value h.
func fnv(h uint32, s []byte) uint32 {
for i := range s {
h ^= uint32(s[i])
h *= 16777619
}
return h
}
func match(s string, t []byte) bool {
for i, c := range t {
if s[i] != c {
return false
}
}
return true
}
// Lookup returns the atom whose name is s. It returns zero if there is no
// such atom. The lookup is case sensitive.
func Lookup(s []byte) Atom {
if len(s) == 0 || len(s) > maxAtomLen {
return 0
}
h := fnv(hash0, s)
if a := table[h&uint32(len(table)-1)]; int(a&0xff) == len(s) && match(a.string(), s) {
return a
}
if a := table[(h>>16)&uint32(len(table)-1)]; int(a&0xff) == len(s) && match(a.string(), s) {
return a
}
return 0
}
// String returns a string whose contents are equal to s. In that sense, it is
// equivalent to string(s) but may be more efficient.
func String(s []byte) string {
if a := Lookup(s); a != 0 {
return a.String()
}
return string(s)
}

View File

@@ -0,0 +1,109 @@
// Copyright 2012 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package atom
import (
"sort"
"testing"
)
func TestKnown(t *testing.T) {
for _, s := range testAtomList {
if atom := Lookup([]byte(s)); atom.String() != s {
t.Errorf("Lookup(%q) = %#x (%q)", s, uint32(atom), atom.String())
}
}
}
func TestHits(t *testing.T) {
for _, a := range table {
if a == 0 {
continue
}
got := Lookup([]byte(a.String()))
if got != a {
t.Errorf("Lookup(%q) = %#x, want %#x", a.String(), uint32(got), uint32(a))
}
}
}
func TestMisses(t *testing.T) {
testCases := []string{
"",
"\x00",
"\xff",
"A",
"DIV",
"Div",
"dIV",
"aa",
"a\x00",
"ab",
"abb",
"abbr0",
"abbr ",
" abbr",
" a",
"acceptcharset",
"acceptCharset",
"accept_charset",
"h0",
"h1h2",
"h7",
"onClick",
"λ",
// The following string has the same hash (0xa1d7fab7) as "onmouseover".
"\x00\x00\x00\x00\x00\x50\x18\xae\x38\xd0\xb7",
}
for _, tc := range testCases {
got := Lookup([]byte(tc))
if got != 0 {
t.Errorf("Lookup(%q): got %d, want 0", tc, got)
}
}
}
func TestForeignObject(t *testing.T) {
const (
afo = Foreignobject
afO = ForeignObject
sfo = "foreignobject"
sfO = "foreignObject"
)
if got := Lookup([]byte(sfo)); got != afo {
t.Errorf("Lookup(%q): got %#v, want %#v", sfo, got, afo)
}
if got := Lookup([]byte(sfO)); got != afO {
t.Errorf("Lookup(%q): got %#v, want %#v", sfO, got, afO)
}
if got := afo.String(); got != sfo {
t.Errorf("Atom(%#v).String(): got %q, want %q", afo, got, sfo)
}
if got := afO.String(); got != sfO {
t.Errorf("Atom(%#v).String(): got %q, want %q", afO, got, sfO)
}
}
func BenchmarkLookup(b *testing.B) {
sortedTable := make([]string, 0, len(table))
for _, a := range table {
if a != 0 {
sortedTable = append(sortedTable, a.String())
}
}
sort.Strings(sortedTable)
x := make([][]byte, 1000)
for i := range x {
x[i] = []byte(sortedTable[i%len(sortedTable)])
}
b.ResetTimer()
for i := 0; i < b.N; i++ {
for _, s := range x {
Lookup(s)
}
}
}

View File

@@ -0,0 +1,636 @@
// Copyright 2012 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// +build ignore
package main
// This program generates table.go and table_test.go.
// Invoke as
//
// go run gen.go |gofmt >table.go
// go run gen.go -test |gofmt >table_test.go
import (
"flag"
"fmt"
"math/rand"
"os"
"sort"
"strings"
)
// identifier converts s to a Go exported identifier.
// It converts "div" to "Div" and "accept-charset" to "AcceptCharset".
func identifier(s string) string {
b := make([]byte, 0, len(s))
cap := true
for _, c := range s {
if c == '-' {
cap = true
continue
}
if cap && 'a' <= c && c <= 'z' {
c -= 'a' - 'A'
}
cap = false
b = append(b, byte(c))
}
return string(b)
}
var test = flag.Bool("test", false, "generate table_test.go")
func main() {
flag.Parse()
var all []string
all = append(all, elements...)
all = append(all, attributes...)
all = append(all, eventHandlers...)
all = append(all, extra...)
sort.Strings(all)
if *test {
fmt.Printf("// generated by go run gen.go -test; DO NOT EDIT\n\n")
fmt.Printf("package atom\n\n")
fmt.Printf("var testAtomList = []string{\n")
for _, s := range all {
fmt.Printf("\t%q,\n", s)
}
fmt.Printf("}\n")
return
}
// uniq - lists have dups
// compute max len too
maxLen := 0
w := 0
for _, s := range all {
if w == 0 || all[w-1] != s {
if maxLen < len(s) {
maxLen = len(s)
}
all[w] = s
w++
}
}
all = all[:w]
// Find hash that minimizes table size.
var best *table
for i := 0; i < 1000000; i++ {
if best != nil && 1<<(best.k-1) < len(all) {
break
}
h := rand.Uint32()
for k := uint(0); k <= 16; k++ {
if best != nil && k >= best.k {
break
}
var t table
if t.init(h, k, all) {
best = &t
break
}
}
}
if best == nil {
fmt.Fprintf(os.Stderr, "failed to construct string table\n")
os.Exit(1)
}
// Lay out strings, using overlaps when possible.
layout := append([]string{}, all...)
// Remove strings that are substrings of other strings
for changed := true; changed; {
changed = false
for i, s := range layout {
if s == "" {
continue
}
for j, t := range layout {
if i != j && t != "" && strings.Contains(s, t) {
changed = true
layout[j] = ""
}
}
}
}
// Join strings where one suffix matches another prefix.
for {
// Find best i, j, k such that layout[i][len-k:] == layout[j][:k],
// maximizing overlap length k.
besti := -1
bestj := -1
bestk := 0
for i, s := range layout {
if s == "" {
continue
}
for j, t := range layout {
if i == j {
continue
}
for k := bestk + 1; k <= len(s) && k <= len(t); k++ {
if s[len(s)-k:] == t[:k] {
besti = i
bestj = j
bestk = k
}
}
}
}
if bestk > 0 {
layout[besti] += layout[bestj][bestk:]
layout[bestj] = ""
continue
}
break
}
text := strings.Join(layout, "")
atom := map[string]uint32{}
for _, s := range all {
off := strings.Index(text, s)
if off < 0 {
panic("lost string " + s)
}
atom[s] = uint32(off<<8 | len(s))
}
// Generate the Go code.
fmt.Printf("// generated by go run gen.go; DO NOT EDIT\n\n")
fmt.Printf("package atom\n\nconst (\n")
for _, s := range all {
fmt.Printf("\t%s Atom = %#x\n", identifier(s), atom[s])
}
fmt.Printf(")\n\n")
fmt.Printf("const hash0 = %#x\n\n", best.h0)
fmt.Printf("const maxAtomLen = %d\n\n", maxLen)
fmt.Printf("var table = [1<<%d]Atom{\n", best.k)
for i, s := range best.tab {
if s == "" {
continue
}
fmt.Printf("\t%#x: %#x, // %s\n", i, atom[s], s)
}
fmt.Printf("}\n")
datasize := (1 << best.k) * 4
fmt.Printf("const atomText =\n")
textsize := len(text)
for len(text) > 60 {
fmt.Printf("\t%q +\n", text[:60])
text = text[60:]
}
fmt.Printf("\t%q\n\n", text)
fmt.Fprintf(os.Stderr, "%d atoms; %d string bytes + %d tables = %d total data\n", len(all), textsize, datasize, textsize+datasize)
}
type byLen []string
func (x byLen) Less(i, j int) bool { return len(x[i]) > len(x[j]) }
func (x byLen) Swap(i, j int) { x[i], x[j] = x[j], x[i] }
func (x byLen) Len() int { return len(x) }
// fnv computes the FNV hash with an arbitrary starting value h.
func fnv(h uint32, s string) uint32 {
for i := 0; i < len(s); i++ {
h ^= uint32(s[i])
h *= 16777619
}
return h
}
// A table represents an attempt at constructing the lookup table.
// The lookup table uses cuckoo hashing, meaning that each string
// can be found in one of two positions.
type table struct {
h0 uint32
k uint
mask uint32
tab []string
}
// hash returns the two hashes for s.
func (t *table) hash(s string) (h1, h2 uint32) {
h := fnv(t.h0, s)
h1 = h & t.mask
h2 = (h >> 16) & t.mask
return
}
// init initializes the table with the given parameters.
// h0 is the initial hash value,
// k is the number of bits of hash value to use, and
// x is the list of strings to store in the table.
// init returns false if the table cannot be constructed.
func (t *table) init(h0 uint32, k uint, x []string) bool {
t.h0 = h0
t.k = k
t.tab = make([]string, 1<<k)
t.mask = 1<<k - 1
for _, s := range x {
if !t.insert(s) {
return false
}
}
return true
}
// insert inserts s in the table.
func (t *table) insert(s string) bool {
h1, h2 := t.hash(s)
if t.tab[h1] == "" {
t.tab[h1] = s
return true
}
if t.tab[h2] == "" {
t.tab[h2] = s
return true
}
if t.push(h1, 0) {
t.tab[h1] = s
return true
}
if t.push(h2, 0) {
t.tab[h2] = s
return true
}
return false
}
// push attempts to push aside the entry in slot i.
func (t *table) push(i uint32, depth int) bool {
if depth > len(t.tab) {
return false
}
s := t.tab[i]
h1, h2 := t.hash(s)
j := h1 + h2 - i
if t.tab[j] != "" && !t.push(j, depth+1) {
return false
}
t.tab[j] = s
return true
}
// The lists of element names and attribute keys were taken from
// http://www.whatwg.org/specs/web-apps/current-work/multipage/section-index.html
// as of the "HTML Living Standard - Last Updated 30 May 2012" version.
var elements = []string{
"a",
"abbr",
"address",
"area",
"article",
"aside",
"audio",
"b",
"base",
"bdi",
"bdo",
"blockquote",
"body",
"br",
"button",
"canvas",
"caption",
"cite",
"code",
"col",
"colgroup",
"command",
"data",
"datalist",
"dd",
"del",
"details",
"dfn",
"dialog",
"div",
"dl",
"dt",
"em",
"embed",
"fieldset",
"figcaption",
"figure",
"footer",
"form",
"h1",
"h2",
"h3",
"h4",
"h5",
"h6",
"head",
"header",
"hgroup",
"hr",
"html",
"i",
"iframe",
"img",
"input",
"ins",
"kbd",
"keygen",
"label",
"legend",
"li",
"link",
"map",
"mark",
"menu",
"meta",
"meter",
"nav",
"noscript",
"object",
"ol",
"optgroup",
"option",
"output",
"p",
"param",
"pre",
"progress",
"q",
"rp",
"rt",
"ruby",
"s",
"samp",
"script",
"section",
"select",
"small",
"source",
"span",
"strong",
"style",
"sub",
"summary",
"sup",
"table",
"tbody",
"td",
"textarea",
"tfoot",
"th",
"thead",
"time",
"title",
"tr",
"track",
"u",
"ul",
"var",
"video",
"wbr",
}
var attributes = []string{
"accept",
"accept-charset",
"accesskey",
"action",
"alt",
"async",
"autocomplete",
"autofocus",
"autoplay",
"border",
"challenge",
"charset",
"checked",
"cite",
"class",
"cols",
"colspan",
"command",
"content",
"contenteditable",
"contextmenu",
"controls",
"coords",
"crossorigin",
"data",
"datetime",
"default",
"defer",
"dir",
"dirname",
"disabled",
"download",
"draggable",
"dropzone",
"enctype",
"for",
"form",
"formaction",
"formenctype",
"formmethod",
"formnovalidate",
"formtarget",
"headers",
"height",
"hidden",
"high",
"href",
"hreflang",
"http-equiv",
"icon",
"id",
"inert",
"ismap",
"itemid",
"itemprop",
"itemref",
"itemscope",
"itemtype",
"keytype",
"kind",
"label",
"lang",
"list",
"loop",
"low",
"manifest",
"max",
"maxlength",
"media",
"mediagroup",
"method",
"min",
"multiple",
"muted",
"name",
"novalidate",
"open",
"optimum",
"pattern",
"ping",
"placeholder",
"poster",
"preload",
"radiogroup",
"readonly",
"rel",
"required",
"reversed",
"rows",
"rowspan",
"sandbox",
"spellcheck",
"scope",
"scoped",
"seamless",
"selected",
"shape",
"size",
"sizes",
"span",
"src",
"srcdoc",
"srclang",
"start",
"step",
"style",
"tabindex",
"target",
"title",
"translate",
"type",
"typemustmatch",
"usemap",
"value",
"width",
"wrap",
}
var eventHandlers = []string{
"onabort",
"onafterprint",
"onbeforeprint",
"onbeforeunload",
"onblur",
"oncancel",
"oncanplay",
"oncanplaythrough",
"onchange",
"onclick",
"onclose",
"oncontextmenu",
"oncuechange",
"ondblclick",
"ondrag",
"ondragend",
"ondragenter",
"ondragleave",
"ondragover",
"ondragstart",
"ondrop",
"ondurationchange",
"onemptied",
"onended",
"onerror",
"onfocus",
"onhashchange",
"oninput",
"oninvalid",
"onkeydown",
"onkeypress",
"onkeyup",
"onload",
"onloadeddata",
"onloadedmetadata",
"onloadstart",
"onmessage",
"onmousedown",
"onmousemove",
"onmouseout",
"onmouseover",
"onmouseup",
"onmousewheel",
"onoffline",
"ononline",
"onpagehide",
"onpageshow",
"onpause",
"onplay",
"onplaying",
"onpopstate",
"onprogress",
"onratechange",
"onreset",
"onresize",
"onscroll",
"onseeked",
"onseeking",
"onselect",
"onshow",
"onstalled",
"onstorage",
"onsubmit",
"onsuspend",
"ontimeupdate",
"onunload",
"onvolumechange",
"onwaiting",
}
// extra are ad-hoc values not covered by any of the lists above.
var extra = []string{
"align",
"annotation",
"annotation-xml",
"applet",
"basefont",
"bgsound",
"big",
"blink",
"center",
"color",
"desc",
"face",
"font",
"foreignObject", // HTML is case-insensitive, but SVG-embedded-in-HTML is case-sensitive.
"foreignobject",
"frame",
"frameset",
"image",
"isindex",
"listing",
"malignmark",
"marquee",
"math",
"mglyph",
"mi",
"mn",
"mo",
"ms",
"mtext",
"nobr",
"noembed",
"noframes",
"plaintext",
"prompt",
"public",
"spacer",
"strike",
"svg",
"system",
"tt",
"xmp",
}

View File

@@ -0,0 +1,694 @@
// generated by go run gen.go; DO NOT EDIT
package atom
const (
A Atom = 0x1
Abbr Atom = 0x4
Accept Atom = 0x2106
AcceptCharset Atom = 0x210e
Accesskey Atom = 0x3309
Action Atom = 0x21b06
Address Atom = 0x5d507
Align Atom = 0x1105
Alt Atom = 0x4503
Annotation Atom = 0x18d0a
AnnotationXml Atom = 0x18d0e
Applet Atom = 0x2d106
Area Atom = 0x31804
Article Atom = 0x39907
Aside Atom = 0x4f05
Async Atom = 0x9305
Audio Atom = 0xaf05
Autocomplete Atom = 0xd50c
Autofocus Atom = 0xe109
Autoplay Atom = 0x10c08
B Atom = 0x101
Base Atom = 0x11404
Basefont Atom = 0x11408
Bdi Atom = 0x1a03
Bdo Atom = 0x12503
Bgsound Atom = 0x13807
Big Atom = 0x14403
Blink Atom = 0x14705
Blockquote Atom = 0x14c0a
Body Atom = 0x2f04
Border Atom = 0x15606
Br Atom = 0x202
Button Atom = 0x15c06
Canvas Atom = 0x4b06
Caption Atom = 0x1e007
Center Atom = 0x2df06
Challenge Atom = 0x23e09
Charset Atom = 0x2807
Checked Atom = 0x33f07
Cite Atom = 0x9704
Class Atom = 0x3d905
Code Atom = 0x16f04
Col Atom = 0x17603
Colgroup Atom = 0x17608
Color Atom = 0x18305
Cols Atom = 0x18804
Colspan Atom = 0x18807
Command Atom = 0x19b07
Content Atom = 0x42c07
Contenteditable Atom = 0x42c0f
Contextmenu Atom = 0x3480b
Controls Atom = 0x1ae08
Coords Atom = 0x1ba06
Crossorigin Atom = 0x1c40b
Data Atom = 0x44304
Datalist Atom = 0x44308
Datetime Atom = 0x25b08
Dd Atom = 0x28802
Default Atom = 0x5207
Defer Atom = 0x17105
Del Atom = 0x4d603
Desc Atom = 0x4804
Details Atom = 0x6507
Dfn Atom = 0x8303
Dialog Atom = 0x1b06
Dir Atom = 0x9d03
Dirname Atom = 0x9d07
Disabled Atom = 0x10008
Div Atom = 0x10703
Dl Atom = 0x13e02
Download Atom = 0x40908
Draggable Atom = 0x1a109
Dropzone Atom = 0x3a208
Dt Atom = 0x4e402
Em Atom = 0x7f02
Embed Atom = 0x7f05
Enctype Atom = 0x23007
Face Atom = 0x2dd04
Fieldset Atom = 0x1d508
Figcaption Atom = 0x1dd0a
Figure Atom = 0x1f106
Font Atom = 0x11804
Footer Atom = 0x5906
For Atom = 0x1fd03
ForeignObject Atom = 0x1fd0d
Foreignobject Atom = 0x20a0d
Form Atom = 0x21704
Formaction Atom = 0x2170a
Formenctype Atom = 0x22c0b
Formmethod Atom = 0x2470a
Formnovalidate Atom = 0x2510e
Formtarget Atom = 0x2660a
Frame Atom = 0x8705
Frameset Atom = 0x8708
H1 Atom = 0x13602
H2 Atom = 0x29602
H3 Atom = 0x2c502
H4 Atom = 0x30e02
H5 Atom = 0x4e602
H6 Atom = 0x27002
Head Atom = 0x2fa04
Header Atom = 0x2fa06
Headers Atom = 0x2fa07
Height Atom = 0x27206
Hgroup Atom = 0x27a06
Hidden Atom = 0x28606
High Atom = 0x29304
Hr Atom = 0x13102
Href Atom = 0x29804
Hreflang Atom = 0x29808
Html Atom = 0x27604
HttpEquiv Atom = 0x2a00a
I Atom = 0x601
Icon Atom = 0x42b04
Id Atom = 0x5102
Iframe Atom = 0x2b406
Image Atom = 0x2ba05
Img Atom = 0x2bf03
Inert Atom = 0x4c105
Input Atom = 0x3f605
Ins Atom = 0x1cd03
Isindex Atom = 0x2c707
Ismap Atom = 0x2ce05
Itemid Atom = 0x9806
Itemprop Atom = 0x57e08
Itemref Atom = 0x2d707
Itemscope Atom = 0x2e509
Itemtype Atom = 0x2ef08
Kbd Atom = 0x1903
Keygen Atom = 0x3906
Keytype Atom = 0x51207
Kind Atom = 0xfd04
Label Atom = 0xba05
Lang Atom = 0x29c04
Legend Atom = 0x1a806
Li Atom = 0x1202
Link Atom = 0x14804
List Atom = 0x44704
Listing Atom = 0x44707
Loop Atom = 0xbe04
Low Atom = 0x13f03
Malignmark Atom = 0x100a
Manifest Atom = 0x5b608
Map Atom = 0x2d003
Mark Atom = 0x1604
Marquee Atom = 0x5f207
Math Atom = 0x2f704
Max Atom = 0x30603
Maxlength Atom = 0x30609
Media Atom = 0xa205
Mediagroup Atom = 0xa20a
Menu Atom = 0x34f04
Meta Atom = 0x45604
Meter Atom = 0x26105
Method Atom = 0x24b06
Mglyph Atom = 0x2c006
Mi Atom = 0x9b02
Min Atom = 0x31003
Mn Atom = 0x25402
Mo Atom = 0x47a02
Ms Atom = 0x2e802
Mtext Atom = 0x31305
Multiple Atom = 0x32108
Muted Atom = 0x32905
Name Atom = 0xa004
Nav Atom = 0x3e03
Nobr Atom = 0x7404
Noembed Atom = 0x7d07
Noframes Atom = 0x8508
Noscript Atom = 0x28b08
Novalidate Atom = 0x2550a
Object Atom = 0x21106
Ol Atom = 0xcd02
Onabort Atom = 0x16007
Onafterprint Atom = 0x1e50c
Onbeforeprint Atom = 0x21f0d
Onbeforeunload Atom = 0x5c90e
Onblur Atom = 0x3e206
Oncancel Atom = 0xb308
Oncanplay Atom = 0x12709
Oncanplaythrough Atom = 0x12710
Onchange Atom = 0x3b808
Onclick Atom = 0x2ad07
Onclose Atom = 0x32e07
Oncontextmenu Atom = 0x3460d
Oncuechange Atom = 0x3530b
Ondblclick Atom = 0x35e0a
Ondrag Atom = 0x36806
Ondragend Atom = 0x36809
Ondragenter Atom = 0x3710b
Ondragleave Atom = 0x37c0b
Ondragover Atom = 0x3870a
Ondragstart Atom = 0x3910b
Ondrop Atom = 0x3a006
Ondurationchange Atom = 0x3b010
Onemptied Atom = 0x3a709
Onended Atom = 0x3c007
Onerror Atom = 0x3c707
Onfocus Atom = 0x3ce07
Onhashchange Atom = 0x3e80c
Oninput Atom = 0x3f407
Oninvalid Atom = 0x3fb09
Onkeydown Atom = 0x40409
Onkeypress Atom = 0x4110a
Onkeyup Atom = 0x42107
Onload Atom = 0x43b06
Onloadeddata Atom = 0x43b0c
Onloadedmetadata Atom = 0x44e10
Onloadstart Atom = 0x4640b
Onmessage Atom = 0x46f09
Onmousedown Atom = 0x4780b
Onmousemove Atom = 0x4830b
Onmouseout Atom = 0x48e0a
Onmouseover Atom = 0x49b0b
Onmouseup Atom = 0x4a609
Onmousewheel Atom = 0x4af0c
Onoffline Atom = 0x4bb09
Ononline Atom = 0x4c608
Onpagehide Atom = 0x4ce0a
Onpageshow Atom = 0x4d90a
Onpause Atom = 0x4e807
Onplay Atom = 0x4f206
Onplaying Atom = 0x4f209
Onpopstate Atom = 0x4fb0a
Onprogress Atom = 0x5050a
Onratechange Atom = 0x5190c
Onreset Atom = 0x52507
Onresize Atom = 0x52c08
Onscroll Atom = 0x53a08
Onseeked Atom = 0x54208
Onseeking Atom = 0x54a09
Onselect Atom = 0x55308
Onshow Atom = 0x55d06
Onstalled Atom = 0x56609
Onstorage Atom = 0x56f09
Onsubmit Atom = 0x57808
Onsuspend Atom = 0x58809
Ontimeupdate Atom = 0x1190c
Onunload Atom = 0x59108
Onvolumechange Atom = 0x5990e
Onwaiting Atom = 0x5a709
Open Atom = 0x58404
Optgroup Atom = 0xc008
Optimum Atom = 0x5b007
Option Atom = 0x5c506
Output Atom = 0x49506
P Atom = 0xc01
Param Atom = 0xc05
Pattern Atom = 0x6e07
Ping Atom = 0xab04
Placeholder Atom = 0xc70b
Plaintext Atom = 0xf109
Poster Atom = 0x17d06
Pre Atom = 0x27f03
Preload Atom = 0x27f07
Progress Atom = 0x50708
Prompt Atom = 0x5bf06
Public Atom = 0x42706
Q Atom = 0x15101
Radiogroup Atom = 0x30a
Readonly Atom = 0x31908
Rel Atom = 0x28003
Required Atom = 0x1f508
Reversed Atom = 0x5e08
Rows Atom = 0x7704
Rowspan Atom = 0x7707
Rp Atom = 0x1eb02
Rt Atom = 0x16502
Ruby Atom = 0xd104
S Atom = 0x2c01
Samp Atom = 0x6b04
Sandbox Atom = 0xe907
Scope Atom = 0x2e905
Scoped Atom = 0x2e906
Script Atom = 0x28d06
Seamless Atom = 0x33308
Section Atom = 0x3dd07
Select Atom = 0x55506
Selected Atom = 0x55508
Shape Atom = 0x1b505
Size Atom = 0x53004
Sizes Atom = 0x53005
Small Atom = 0x1bf05
Source Atom = 0x1cf06
Spacer Atom = 0x30006
Span Atom = 0x7a04
Spellcheck Atom = 0x33a0a
Src Atom = 0x3d403
Srcdoc Atom = 0x3d406
Srclang Atom = 0x41a07
Start Atom = 0x39705
Step Atom = 0x5bc04
Strike Atom = 0x50e06
Strong Atom = 0x53406
Style Atom = 0x5db05
Sub Atom = 0x57a03
Summary Atom = 0x5e007
Sup Atom = 0x5e703
Svg Atom = 0x5ea03
System Atom = 0x5ed06
Tabindex Atom = 0x45c08
Table Atom = 0x43605
Target Atom = 0x26a06
Tbody Atom = 0x2e05
Td Atom = 0x4702
Textarea Atom = 0x31408
Tfoot Atom = 0x5805
Th Atom = 0x13002
Thead Atom = 0x2f905
Time Atom = 0x11b04
Title Atom = 0x8e05
Tr Atom = 0xf902
Track Atom = 0xf905
Translate Atom = 0x16609
Tt Atom = 0x7002
Type Atom = 0x23304
Typemustmatch Atom = 0x2330d
U Atom = 0xb01
Ul Atom = 0x5602
Usemap Atom = 0x4ec06
Value Atom = 0x4005
Var Atom = 0x10903
Video Atom = 0x2a905
Wbr Atom = 0x14103
Width Atom = 0x4e205
Wrap Atom = 0x56204
Xmp Atom = 0xef03
)
const hash0 = 0xc17da63e
const maxAtomLen = 16
var table = [1 << 9]Atom{
0x1: 0x4830b, // onmousemove
0x2: 0x5a709, // onwaiting
0x4: 0x5bf06, // prompt
0x7: 0x5b007, // optimum
0x8: 0x1604, // mark
0xa: 0x2d707, // itemref
0xb: 0x4d90a, // onpageshow
0xc: 0x55506, // select
0xd: 0x1a109, // draggable
0xe: 0x3e03, // nav
0xf: 0x19b07, // command
0x11: 0xb01, // u
0x14: 0x2fa07, // headers
0x15: 0x44308, // datalist
0x17: 0x6b04, // samp
0x1a: 0x40409, // onkeydown
0x1b: 0x53a08, // onscroll
0x1c: 0x17603, // col
0x20: 0x57e08, // itemprop
0x21: 0x2a00a, // http-equiv
0x22: 0x5e703, // sup
0x24: 0x1f508, // required
0x2b: 0x27f07, // preload
0x2c: 0x21f0d, // onbeforeprint
0x2d: 0x3710b, // ondragenter
0x2e: 0x4e402, // dt
0x2f: 0x57808, // onsubmit
0x30: 0x13102, // hr
0x31: 0x3460d, // oncontextmenu
0x33: 0x2ba05, // image
0x34: 0x4e807, // onpause
0x35: 0x27a06, // hgroup
0x36: 0xab04, // ping
0x37: 0x55308, // onselect
0x3a: 0x10703, // div
0x40: 0x9b02, // mi
0x41: 0x33308, // seamless
0x42: 0x2807, // charset
0x43: 0x5102, // id
0x44: 0x4fb0a, // onpopstate
0x45: 0x4d603, // del
0x46: 0x5f207, // marquee
0x47: 0x3309, // accesskey
0x49: 0x5906, // footer
0x4a: 0x2d106, // applet
0x4b: 0x2ce05, // ismap
0x51: 0x34f04, // menu
0x52: 0x2f04, // body
0x55: 0x8708, // frameset
0x56: 0x52507, // onreset
0x57: 0x14705, // blink
0x58: 0x8e05, // title
0x59: 0x39907, // article
0x5b: 0x13002, // th
0x5d: 0x15101, // q
0x5e: 0x58404, // open
0x5f: 0x31804, // area
0x61: 0x43b06, // onload
0x62: 0x3f605, // input
0x63: 0x11404, // base
0x64: 0x18807, // colspan
0x65: 0x51207, // keytype
0x66: 0x13e02, // dl
0x68: 0x1d508, // fieldset
0x6a: 0x31003, // min
0x6b: 0x10903, // var
0x6f: 0x2fa06, // header
0x70: 0x16502, // rt
0x71: 0x17608, // colgroup
0x72: 0x25402, // mn
0x74: 0x16007, // onabort
0x75: 0x3906, // keygen
0x76: 0x4bb09, // onoffline
0x77: 0x23e09, // challenge
0x78: 0x2d003, // map
0x7a: 0x30e02, // h4
0x7b: 0x3c707, // onerror
0x7c: 0x30609, // maxlength
0x7d: 0x31305, // mtext
0x7e: 0x5805, // tfoot
0x7f: 0x11804, // font
0x80: 0x100a, // malignmark
0x81: 0x45604, // meta
0x82: 0x9305, // async
0x83: 0x2c502, // h3
0x84: 0x28802, // dd
0x85: 0x29804, // href
0x86: 0xa20a, // mediagroup
0x87: 0x1ba06, // coords
0x88: 0x41a07, // srclang
0x89: 0x35e0a, // ondblclick
0x8a: 0x4005, // value
0x8c: 0xb308, // oncancel
0x8e: 0x33a0a, // spellcheck
0x8f: 0x8705, // frame
0x91: 0x14403, // big
0x94: 0x21b06, // action
0x95: 0x9d03, // dir
0x97: 0x31908, // readonly
0x99: 0x43605, // table
0x9a: 0x5e007, // summary
0x9b: 0x14103, // wbr
0x9c: 0x30a, // radiogroup
0x9d: 0xa004, // name
0x9f: 0x5ed06, // system
0xa1: 0x18305, // color
0xa2: 0x4b06, // canvas
0xa3: 0x27604, // html
0xa5: 0x54a09, // onseeking
0xac: 0x1b505, // shape
0xad: 0x28003, // rel
0xae: 0x12710, // oncanplaythrough
0xaf: 0x3870a, // ondragover
0xb1: 0x1fd0d, // foreignObject
0xb3: 0x7704, // rows
0xb6: 0x44707, // listing
0xb7: 0x49506, // output
0xb9: 0x3480b, // contextmenu
0xbb: 0x13f03, // low
0xbc: 0x1eb02, // rp
0xbd: 0x58809, // onsuspend
0xbe: 0x15c06, // button
0xbf: 0x4804, // desc
0xc1: 0x3dd07, // section
0xc2: 0x5050a, // onprogress
0xc3: 0x56f09, // onstorage
0xc4: 0x2f704, // math
0xc5: 0x4f206, // onplay
0xc7: 0x5602, // ul
0xc8: 0x6e07, // pattern
0xc9: 0x4af0c, // onmousewheel
0xca: 0x36809, // ondragend
0xcb: 0xd104, // ruby
0xcc: 0xc01, // p
0xcd: 0x32e07, // onclose
0xce: 0x26105, // meter
0xcf: 0x13807, // bgsound
0xd2: 0x27206, // height
0xd4: 0x101, // b
0xd5: 0x2ef08, // itemtype
0xd8: 0x1e007, // caption
0xd9: 0x10008, // disabled
0xdc: 0x5ea03, // svg
0xdd: 0x1bf05, // small
0xde: 0x44304, // data
0xe0: 0x4c608, // ononline
0xe1: 0x2c006, // mglyph
0xe3: 0x7f05, // embed
0xe4: 0xf902, // tr
0xe5: 0x4640b, // onloadstart
0xe7: 0x3b010, // ondurationchange
0xed: 0x12503, // bdo
0xee: 0x4702, // td
0xef: 0x4f05, // aside
0xf0: 0x29602, // h2
0xf1: 0x50708, // progress
0xf2: 0x14c0a, // blockquote
0xf4: 0xba05, // label
0xf5: 0x601, // i
0xf7: 0x7707, // rowspan
0xfb: 0x4f209, // onplaying
0xfd: 0x2bf03, // img
0xfe: 0xc008, // optgroup
0xff: 0x42c07, // content
0x101: 0x5190c, // onratechange
0x103: 0x3e80c, // onhashchange
0x104: 0x6507, // details
0x106: 0x40908, // download
0x109: 0xe907, // sandbox
0x10b: 0x42c0f, // contenteditable
0x10d: 0x37c0b, // ondragleave
0x10e: 0x2106, // accept
0x10f: 0x55508, // selected
0x112: 0x2170a, // formaction
0x113: 0x2df06, // center
0x115: 0x44e10, // onloadedmetadata
0x116: 0x14804, // link
0x117: 0x11b04, // time
0x118: 0x1c40b, // crossorigin
0x119: 0x3ce07, // onfocus
0x11a: 0x56204, // wrap
0x11b: 0x42b04, // icon
0x11d: 0x2a905, // video
0x11e: 0x3d905, // class
0x121: 0x5990e, // onvolumechange
0x122: 0x3e206, // onblur
0x123: 0x2e509, // itemscope
0x124: 0x5db05, // style
0x127: 0x42706, // public
0x129: 0x2510e, // formnovalidate
0x12a: 0x55d06, // onshow
0x12c: 0x16609, // translate
0x12d: 0x9704, // cite
0x12e: 0x2e802, // ms
0x12f: 0x1190c, // ontimeupdate
0x130: 0xfd04, // kind
0x131: 0x2660a, // formtarget
0x135: 0x3c007, // onended
0x136: 0x28606, // hidden
0x137: 0x2c01, // s
0x139: 0x2470a, // formmethod
0x13a: 0x44704, // list
0x13c: 0x27002, // h6
0x13d: 0xcd02, // ol
0x13e: 0x3530b, // oncuechange
0x13f: 0x20a0d, // foreignobject
0x143: 0x5c90e, // onbeforeunload
0x145: 0x3a709, // onemptied
0x146: 0x17105, // defer
0x147: 0xef03, // xmp
0x148: 0xaf05, // audio
0x149: 0x1903, // kbd
0x14c: 0x46f09, // onmessage
0x14d: 0x5c506, // option
0x14e: 0x4503, // alt
0x14f: 0x33f07, // checked
0x150: 0x10c08, // autoplay
0x152: 0x202, // br
0x153: 0x2550a, // novalidate
0x156: 0x7d07, // noembed
0x159: 0x2ad07, // onclick
0x15a: 0x4780b, // onmousedown
0x15b: 0x3b808, // onchange
0x15e: 0x3fb09, // oninvalid
0x15f: 0x2e906, // scoped
0x160: 0x1ae08, // controls
0x161: 0x32905, // muted
0x163: 0x4ec06, // usemap
0x164: 0x1dd0a, // figcaption
0x165: 0x36806, // ondrag
0x166: 0x29304, // high
0x168: 0x3d403, // src
0x169: 0x17d06, // poster
0x16b: 0x18d0e, // annotation-xml
0x16c: 0x5bc04, // step
0x16d: 0x4, // abbr
0x16e: 0x1b06, // dialog
0x170: 0x1202, // li
0x172: 0x47a02, // mo
0x175: 0x1fd03, // for
0x176: 0x1cd03, // ins
0x178: 0x53004, // size
0x17a: 0x5207, // default
0x17b: 0x1a03, // bdi
0x17c: 0x4ce0a, // onpagehide
0x17d: 0x9d07, // dirname
0x17e: 0x23304, // type
0x17f: 0x21704, // form
0x180: 0x4c105, // inert
0x181: 0x12709, // oncanplay
0x182: 0x8303, // dfn
0x183: 0x45c08, // tabindex
0x186: 0x7f02, // em
0x187: 0x29c04, // lang
0x189: 0x3a208, // dropzone
0x18a: 0x4110a, // onkeypress
0x18b: 0x25b08, // datetime
0x18c: 0x18804, // cols
0x18d: 0x1, // a
0x18e: 0x43b0c, // onloadeddata
0x191: 0x15606, // border
0x192: 0x2e05, // tbody
0x193: 0x24b06, // method
0x195: 0xbe04, // loop
0x196: 0x2b406, // iframe
0x198: 0x2fa04, // head
0x19e: 0x5b608, // manifest
0x19f: 0xe109, // autofocus
0x1a0: 0x16f04, // code
0x1a1: 0x53406, // strong
0x1a2: 0x32108, // multiple
0x1a3: 0xc05, // param
0x1a6: 0x23007, // enctype
0x1a7: 0x2dd04, // face
0x1a8: 0xf109, // plaintext
0x1a9: 0x13602, // h1
0x1aa: 0x56609, // onstalled
0x1ad: 0x28d06, // script
0x1ae: 0x30006, // spacer
0x1af: 0x52c08, // onresize
0x1b0: 0x49b0b, // onmouseover
0x1b1: 0x59108, // onunload
0x1b2: 0x54208, // onseeked
0x1b4: 0x2330d, // typemustmatch
0x1b5: 0x1f106, // figure
0x1b6: 0x48e0a, // onmouseout
0x1b7: 0x27f03, // pre
0x1b8: 0x4e205, // width
0x1bb: 0x7404, // nobr
0x1be: 0x7002, // tt
0x1bf: 0x1105, // align
0x1c0: 0x3f407, // oninput
0x1c3: 0x42107, // onkeyup
0x1c6: 0x1e50c, // onafterprint
0x1c7: 0x210e, // accept-charset
0x1c8: 0x9806, // itemid
0x1cb: 0x50e06, // strike
0x1cc: 0x57a03, // sub
0x1cd: 0xf905, // track
0x1ce: 0x39705, // start
0x1d0: 0x11408, // basefont
0x1d6: 0x1cf06, // source
0x1d7: 0x1a806, // legend
0x1d8: 0x2f905, // thead
0x1da: 0x2e905, // scope
0x1dd: 0x21106, // object
0x1de: 0xa205, // media
0x1df: 0x18d0a, // annotation
0x1e0: 0x22c0b, // formenctype
0x1e2: 0x28b08, // noscript
0x1e4: 0x53005, // sizes
0x1e5: 0xd50c, // autocomplete
0x1e6: 0x7a04, // span
0x1e7: 0x8508, // noframes
0x1e8: 0x26a06, // target
0x1e9: 0x3a006, // ondrop
0x1ea: 0x3d406, // srcdoc
0x1ec: 0x5e08, // reversed
0x1f0: 0x2c707, // isindex
0x1f3: 0x29808, // hreflang
0x1f5: 0x4e602, // h5
0x1f6: 0x5d507, // address
0x1fa: 0x30603, // max
0x1fb: 0xc70b, // placeholder
0x1fc: 0x31408, // textarea
0x1fe: 0x4a609, // onmouseup
0x1ff: 0x3910b, // ondragstart
}
const atomText = "abbradiogrouparamalignmarkbdialogaccept-charsetbodyaccesskey" +
"genavaluealtdescanvasidefaultfootereversedetailsampatternobr" +
"owspanoembedfnoframesetitleasyncitemidirnamediagroupingaudio" +
"ncancelabelooptgrouplaceholderubyautocompleteautofocusandbox" +
"mplaintextrackindisabledivarautoplaybasefontimeupdatebdoncan" +
"playthrough1bgsoundlowbrbigblinkblockquoteborderbuttonabortr" +
"anslatecodefercolgroupostercolorcolspannotation-xmlcommandra" +
"ggablegendcontrolshapecoordsmallcrossoriginsourcefieldsetfig" +
"captionafterprintfigurequiredforeignObjectforeignobjectforma" +
"ctionbeforeprintformenctypemustmatchallengeformmethodformnov" +
"alidatetimeterformtargeth6heightmlhgroupreloadhiddenoscripth" +
"igh2hreflanghttp-equivideonclickiframeimageimglyph3isindexis" +
"mappletitemrefacenteritemscopeditemtypematheaderspacermaxlen" +
"gth4minmtextareadonlymultiplemutedoncloseamlesspellcheckedon" +
"contextmenuoncuechangeondblclickondragendondragenterondragle" +
"aveondragoverondragstarticleondropzonemptiedondurationchange" +
"onendedonerroronfocusrcdoclassectionbluronhashchangeoninputo" +
"ninvalidonkeydownloadonkeypressrclangonkeyupublicontentedita" +
"bleonloadeddatalistingonloadedmetadatabindexonloadstartonmes" +
"sageonmousedownonmousemoveonmouseoutputonmouseoveronmouseupo" +
"nmousewheelonofflinertononlineonpagehidelonpageshowidth5onpa" +
"usemaponplayingonpopstateonprogresstrikeytypeonratechangeonr" +
"esetonresizestrongonscrollonseekedonseekingonselectedonshowr" +
"aponstalledonstorageonsubmitempropenonsuspendonunloadonvolum" +
"echangeonwaitingoptimumanifestepromptoptionbeforeunloaddress" +
"tylesummarysupsvgsystemarquee"

View File

@@ -0,0 +1,341 @@
// generated by go run gen.go -test; DO NOT EDIT
package atom
var testAtomList = []string{
"a",
"abbr",
"accept",
"accept-charset",
"accesskey",
"action",
"address",
"align",
"alt",
"annotation",
"annotation-xml",
"applet",
"area",
"article",
"aside",
"async",
"audio",
"autocomplete",
"autofocus",
"autoplay",
"b",
"base",
"basefont",
"bdi",
"bdo",
"bgsound",
"big",
"blink",
"blockquote",
"body",
"border",
"br",
"button",
"canvas",
"caption",
"center",
"challenge",
"charset",
"checked",
"cite",
"cite",
"class",
"code",
"col",
"colgroup",
"color",
"cols",
"colspan",
"command",
"command",
"content",
"contenteditable",
"contextmenu",
"controls",
"coords",
"crossorigin",
"data",
"data",
"datalist",
"datetime",
"dd",
"default",
"defer",
"del",
"desc",
"details",
"dfn",
"dialog",
"dir",
"dirname",
"disabled",
"div",
"dl",
"download",
"draggable",
"dropzone",
"dt",
"em",
"embed",
"enctype",
"face",
"fieldset",
"figcaption",
"figure",
"font",
"footer",
"for",
"foreignObject",
"foreignobject",
"form",
"form",
"formaction",
"formenctype",
"formmethod",
"formnovalidate",
"formtarget",
"frame",
"frameset",
"h1",
"h2",
"h3",
"h4",
"h5",
"h6",
"head",
"header",
"headers",
"height",
"hgroup",
"hidden",
"high",
"hr",
"href",
"hreflang",
"html",
"http-equiv",
"i",
"icon",
"id",
"iframe",
"image",
"img",
"inert",
"input",
"ins",
"isindex",
"ismap",
"itemid",
"itemprop",
"itemref",
"itemscope",
"itemtype",
"kbd",
"keygen",
"keytype",
"kind",
"label",
"label",
"lang",
"legend",
"li",
"link",
"list",
"listing",
"loop",
"low",
"malignmark",
"manifest",
"map",
"mark",
"marquee",
"math",
"max",
"maxlength",
"media",
"mediagroup",
"menu",
"meta",
"meter",
"method",
"mglyph",
"mi",
"min",
"mn",
"mo",
"ms",
"mtext",
"multiple",
"muted",
"name",
"nav",
"nobr",
"noembed",
"noframes",
"noscript",
"novalidate",
"object",
"ol",
"onabort",
"onafterprint",
"onbeforeprint",
"onbeforeunload",
"onblur",
"oncancel",
"oncanplay",
"oncanplaythrough",
"onchange",
"onclick",
"onclose",
"oncontextmenu",
"oncuechange",
"ondblclick",
"ondrag",
"ondragend",
"ondragenter",
"ondragleave",
"ondragover",
"ondragstart",
"ondrop",
"ondurationchange",
"onemptied",
"onended",
"onerror",
"onfocus",
"onhashchange",
"oninput",
"oninvalid",
"onkeydown",
"onkeypress",
"onkeyup",
"onload",
"onloadeddata",
"onloadedmetadata",
"onloadstart",
"onmessage",
"onmousedown",
"onmousemove",
"onmouseout",
"onmouseover",
"onmouseup",
"onmousewheel",
"onoffline",
"ononline",
"onpagehide",
"onpageshow",
"onpause",
"onplay",
"onplaying",
"onpopstate",
"onprogress",
"onratechange",
"onreset",
"onresize",
"onscroll",
"onseeked",
"onseeking",
"onselect",
"onshow",
"onstalled",
"onstorage",
"onsubmit",
"onsuspend",
"ontimeupdate",
"onunload",
"onvolumechange",
"onwaiting",
"open",
"optgroup",
"optimum",
"option",
"output",
"p",
"param",
"pattern",
"ping",
"placeholder",
"plaintext",
"poster",
"pre",
"preload",
"progress",
"prompt",
"public",
"q",
"radiogroup",
"readonly",
"rel",
"required",
"reversed",
"rows",
"rowspan",
"rp",
"rt",
"ruby",
"s",
"samp",
"sandbox",
"scope",
"scoped",
"script",
"seamless",
"section",
"select",
"selected",
"shape",
"size",
"sizes",
"small",
"source",
"spacer",
"span",
"span",
"spellcheck",
"src",
"srcdoc",
"srclang",
"start",
"step",
"strike",
"strong",
"style",
"style",
"sub",
"summary",
"sup",
"svg",
"system",
"tabindex",
"table",
"target",
"tbody",
"td",
"textarea",
"tfoot",
"th",
"thead",
"time",
"title",
"title",
"tr",
"track",
"translate",
"tt",
"type",
"typemustmatch",
"u",
"ul",
"usemap",
"value",
"var",
"video",
"wbr",
"width",
"wrap",
"xmp",
}

View File

@@ -0,0 +1,227 @@
// Package charset provides common text encodings for HTML documents.
//
// The mapping from encoding labels to encodings is defined at
// http://encoding.spec.whatwg.org.
package charset
import (
"bytes"
"io"
"mime"
"strings"
"unicode/utf8"
"code.google.com/p/go.net/html"
"code.google.com/p/go.text/encoding"
"code.google.com/p/go.text/encoding/charmap"
"code.google.com/p/go.text/transform"
)
// Lookup returns the encoding with the specified label, and its canonical
// name. It returns nil and the empty string if label is not one of the
// standard encodings for HTML. Matching is case-insensitive and ignores
// leading and trailing whitespace.
func Lookup(label string) (e encoding.Encoding, name string) {
label = strings.ToLower(strings.Trim(label, "\t\n\r\f "))
enc := encodings[label]
return enc.e, enc.name
}
// DetermineEncoding determines the encoding of an HTML document by examining
// up to the first 1024 bytes of content and the declared Content-Type.
//
// See http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#determining-the-character-encoding
func DetermineEncoding(content []byte, contentType string) (e encoding.Encoding, name string, certain bool) {
if len(content) > 1024 {
content = content[:1024]
}
for _, b := range boms {
if bytes.HasPrefix(content, b.bom) {
e, name = Lookup(b.enc)
return e, name, true
}
}
if _, params, err := mime.ParseMediaType(contentType); err == nil {
if cs, ok := params["charset"]; ok {
if e, name = Lookup(cs); e != nil {
return e, name, true
}
}
}
if len(content) > 0 {
e, name = prescan(content)
if e != nil {
return e, name, false
}
}
// Try to detect UTF-8.
// First eliminate any partial rune at the end.
for i := len(content) - 1; i >= 0 && i > len(content)-4; i-- {
b := content[i]
if b < 0x80 {
break
}
if utf8.RuneStart(b) {
content = content[:i]
break
}
}
hasHighBit := false
for _, c := range content {
if c >= 0x80 {
hasHighBit = true
break
}
}
if hasHighBit && utf8.Valid(content) {
return encoding.Nop, "utf-8", false
}
// TODO: change default depending on user's locale?
return charmap.Windows1252, "windows-1252", false
}
// NewReader returns an io.Reader that converts the content of r to UTF-8.
// It calls DetermineEncoding to find out what r's encoding is.
func NewReader(r io.Reader, contentType string) (io.Reader, error) {
preview := make([]byte, 1024)
n, err := io.ReadFull(r, preview)
switch {
case err == io.ErrUnexpectedEOF:
preview = preview[:n]
r = bytes.NewReader(preview)
case err != nil:
return nil, err
default:
r = io.MultiReader(bytes.NewReader(preview), r)
}
if e, _, _ := DetermineEncoding(preview, contentType); e != encoding.Nop {
r = transform.NewReader(r, e.NewDecoder())
}
return r, nil
}
func prescan(content []byte) (e encoding.Encoding, name string) {
z := html.NewTokenizer(bytes.NewReader(content))
for {
switch z.Next() {
case html.ErrorToken:
return nil, ""
case html.StartTagToken, html.SelfClosingTagToken:
tagName, hasAttr := z.TagName()
if !bytes.Equal(tagName, []byte("meta")) {
continue
}
attrList := make(map[string]bool)
gotPragma := false
const (
dontKnow = iota
doNeedPragma
doNotNeedPragma
)
needPragma := dontKnow
name = ""
e = nil
for hasAttr {
var key, val []byte
key, val, hasAttr = z.TagAttr()
ks := string(key)
if attrList[ks] {
continue
}
attrList[ks] = true
for i, c := range val {
if 'A' <= c && c <= 'Z' {
val[i] = c + 0x20
}
}
switch ks {
case "http-equiv":
if bytes.Equal(val, []byte("content-type")) {
gotPragma = true
}
case "content":
if e == nil {
name = fromMetaElement(string(val))
if name != "" {
e, name = Lookup(name)
if e != nil {
needPragma = doNeedPragma
}
}
}
case "charset":
e, name = Lookup(string(val))
needPragma = doNotNeedPragma
}
}
if needPragma == dontKnow || needPragma == doNeedPragma && !gotPragma {
continue
}
if strings.HasPrefix(name, "utf-16") {
name = "utf-8"
e = encoding.Nop
}
if e != nil {
return e, name
}
}
}
}
func fromMetaElement(s string) string {
for s != "" {
csLoc := strings.Index(s, "charset")
if csLoc == -1 {
return ""
}
s = s[csLoc+len("charset"):]
s = strings.TrimLeft(s, " \t\n\f\r")
if !strings.HasPrefix(s, "=") {
continue
}
s = s[1:]
s = strings.TrimLeft(s, " \t\n\f\r")
if s == "" {
return ""
}
if q := s[0]; q == '"' || q == '\'' {
s = s[1:]
closeQuote := strings.IndexRune(s, rune(q))
if closeQuote == -1 {
return ""
}
return s[:closeQuote]
}
end := strings.IndexAny(s, "; \t\n\f\r")
if end == -1 {
end = len(s)
}
return s[:end]
}
return ""
}
var boms = []struct {
bom []byte
enc string
}{
{[]byte{0xfe, 0xff}, "utf-16be"},
{[]byte{0xff, 0xfe}, "utf-16le"},
{[]byte{0xef, 0xbb, 0xbf}, "utf-8"},
}

View File

@@ -0,0 +1,200 @@
package charset
import (
"bytes"
"io/ioutil"
"strings"
"testing"
"code.google.com/p/go.text/transform"
)
func transformString(t transform.Transformer, s string) (string, error) {
r := transform.NewReader(strings.NewReader(s), t)
b, err := ioutil.ReadAll(r)
return string(b), err
}
var testCases = []struct {
utf8, other, otherEncoding string
}{
{"Résumé", "Résumé", "utf8"},
{"Résumé", "R\xe9sum\xe9", "latin1"},
{"これは漢字です。", "S0\x8c0o0\"oW[g0Y0\x020", "UTF-16LE"},
{"これは漢字です。", "0S0\x8c0oo\"[W0g0Y0\x02", "UTF-16BE"},
{"Hello, world", "Hello, world", "ASCII"},
{"Gdańsk", "Gda\xf1sk", "ISO-8859-2"},
{"Ââ Čč Đđ Ŋŋ Õõ Šš Žž Åå Ää", "\xc2\xe2 \xc8\xe8 \xa9\xb9 \xaf\xbf \xd5\xf5 \xaa\xba \xac\xbc \xc5\xe5 \xc4\xe4", "ISO-8859-10"},
{"สำหรับ", "\xca\xd3\xcb\xc3\u047a", "ISO-8859-11"},
{"latviešu", "latvie\xf0u", "ISO-8859-13"},
{"Seònaid", "Se\xf2naid", "ISO-8859-14"},
{"€1 is cheap", "\xa41 is cheap", "ISO-8859-15"},
{"românește", "rom\xe2ne\xbate", "ISO-8859-16"},
{"nutraĵo", "nutra\xbco", "ISO-8859-3"},
{"Kalâdlit", "Kal\xe2dlit", "ISO-8859-4"},
{"русский", "\xe0\xe3\xe1\xe1\xda\xd8\xd9", "ISO-8859-5"},
{"ελληνικά", "\xe5\xeb\xeb\xe7\xed\xe9\xea\xdc", "ISO-8859-7"},
{"Kağan", "Ka\xf0an", "ISO-8859-9"},
{"Résumé", "R\x8esum\x8e", "macintosh"},
{"Gdańsk", "Gda\xf1sk", "windows-1250"},
{"русский", "\xf0\xf3\xf1\xf1\xea\xe8\xe9", "windows-1251"},
{"Résumé", "R\xe9sum\xe9", "windows-1252"},
{"ελληνικά", "\xe5\xeb\xeb\xe7\xed\xe9\xea\xdc", "windows-1253"},
{"Kağan", "Ka\xf0an", "windows-1254"},
{"עִבְרִית", "\xf2\xc4\xe1\xc0\xf8\xc4\xe9\xfa", "windows-1255"},
{"العربية", "\xc7\xe1\xda\xd1\xc8\xed\xc9", "windows-1256"},
{"latviešu", "latvie\xf0u", "windows-1257"},
{"Việt", "Vi\xea\xf2t", "windows-1258"},
{"สำหรับ", "\xca\xd3\xcb\xc3\u047a", "windows-874"},
{"русский", "\xd2\xd5\xd3\xd3\xcb\xc9\xca", "KOI8-R"},
{"українська", "\xd5\xcb\xd2\xc1\xa7\xce\xd3\xd8\xcb\xc1", "KOI8-U"},
{"Hello 常用國字標準字體表", "Hello \xb1`\xa5\u03b0\xea\xa6r\xbc\u0437\u01e6r\xc5\xe9\xaa\xed", "big5"},
{"Hello 常用國字標準字體表", "Hello \xb3\xa3\xd3\xc3\x87\xf8\xd7\xd6\x98\xcb\x9c\xca\xd7\xd6\xf3\x77\xb1\xed", "gbk"},
{"Hello 常用國字標準字體表", "Hello \xb3\xa3\xd3\xc3\x87\xf8\xd7\xd6\x98\xcb\x9c\xca\xd7\xd6\xf3\x77\xb1\xed", "gb18030"},
{"עִבְרִית", "\x81\x30\xfb\x30\x81\x30\xf6\x34\x81\x30\xf9\x33\x81\x30\xf6\x30\x81\x30\xfb\x36\x81\x30\xf6\x34\x81\x30\xfa\x31\x81\x30\xfb\x38", "gb18030"},
{"㧯", "\x82\x31\x89\x38", "gb18030"},
{"これは漢字です。", "\x82\xb1\x82\xea\x82\xcd\x8a\xbf\x8e\x9a\x82\xc5\x82\xb7\x81B", "SJIS"},
{"Hello, 世界!", "Hello, \x90\xa2\x8aE!", "SJIS"},
{"イウエオカ", "\xb2\xb3\xb4\xb5\xb6", "SJIS"},
{"これは漢字です。", "\xa4\xb3\xa4\xec\xa4\u03f4\xc1\xbb\xfa\xa4\u01e4\xb9\xa1\xa3", "EUC-JP"},
{"Hello, 世界!", "Hello, \x1b$B@$3&\x1b(B!", "ISO-2022-JP"},
{"네이트 | 즐거움의 시작, 슈파스(Spaβ) NATE", "\xb3\xd7\xc0\xcc\xc6\xae | \xc1\xf1\xb0\xc5\xbf\xf2\xc0\xc7 \xbd\xc3\xc0\xdb, \xbd\xb4\xc6\xc4\xbd\xba(Spa\xa5\xe2) NATE", "EUC-KR"},
}
func TestDecode(t *testing.T) {
for _, tc := range testCases {
e, _ := Lookup(tc.otherEncoding)
if e == nil {
t.Errorf("%s: not found", tc.otherEncoding)
continue
}
s, err := transformString(e.NewDecoder(), tc.other)
if err != nil {
t.Errorf("%s: decode %q: %v", tc.otherEncoding, tc.other, err)
continue
}
if s != tc.utf8 {
t.Errorf("%s: got %q, want %q", tc.otherEncoding, s, tc.utf8)
}
}
}
func TestEncode(t *testing.T) {
for _, tc := range testCases {
e, _ := Lookup(tc.otherEncoding)
if e == nil {
t.Errorf("%s: not found", tc.otherEncoding)
continue
}
s, err := transformString(e.NewEncoder(), tc.utf8)
if err != nil {
t.Errorf("%s: encode %q: %s", tc.otherEncoding, tc.utf8, err)
continue
}
if s != tc.other {
t.Errorf("%s: got %q, want %q", tc.otherEncoding, s, tc.other)
}
}
}
// TestNames verifies that you can pass an encoding's name to Lookup and get
// the same encoding back (except for "replacement").
func TestNames(t *testing.T) {
for _, e := range encodings {
if e.name == "replacement" {
continue
}
_, got := Lookup(e.name)
if got != e.name {
t.Errorf("got %q, want %q", got, e.name)
continue
}
}
}
var sniffTestCases = []struct {
filename, declared, want string
}{
{"HTTP-charset.html", "text/html; charset=iso-8859-15", "iso-8859-15"},
{"UTF-16LE-BOM.html", "", "utf-16le"},
{"UTF-16BE-BOM.html", "", "utf-16be"},
{"meta-content-attribute.html", "text/html", "iso-8859-15"},
{"meta-charset-attribute.html", "text/html", "iso-8859-15"},
{"No-encoding-declaration.html", "text/html", "utf-8"},
{"HTTP-vs-UTF-8-BOM.html", "text/html; charset=iso-8859-15", "utf-8"},
{"HTTP-vs-meta-content.html", "text/html; charset=iso-8859-15", "iso-8859-15"},
{"HTTP-vs-meta-charset.html", "text/html; charset=iso-8859-15", "iso-8859-15"},
{"UTF-8-BOM-vs-meta-content.html", "text/html", "utf-8"},
{"UTF-8-BOM-vs-meta-charset.html", "text/html", "utf-8"},
}
func TestSniff(t *testing.T) {
for _, tc := range sniffTestCases {
content, err := ioutil.ReadFile("testdata/" + tc.filename)
if err != nil {
t.Errorf("%s: error reading file: %v", tc.filename, err)
continue
}
_, name, _ := DetermineEncoding(content, tc.declared)
if name != tc.want {
t.Errorf("%s: got %q, want %q", tc.filename, name, tc.want)
continue
}
}
}
func TestReader(t *testing.T) {
for _, tc := range sniffTestCases {
content, err := ioutil.ReadFile("testdata/" + tc.filename)
if err != nil {
t.Errorf("%s: error reading file: %v", tc.filename, err)
continue
}
r, err := NewReader(bytes.NewReader(content), tc.declared)
if err != nil {
t.Errorf("%s: error creating reader: %v", tc.filename, err)
continue
}
got, err := ioutil.ReadAll(r)
if err != nil {
t.Errorf("%s: error reading from charset.NewReader: %v", tc.filename, err)
continue
}
e, _ := Lookup(tc.want)
want, err := ioutil.ReadAll(transform.NewReader(bytes.NewReader(content), e.NewDecoder()))
if err != nil {
t.Errorf("%s: error decoding with hard-coded charset name: %v", tc.filename, err)
continue
}
if !bytes.Equal(got, want) {
t.Errorf("%s: got %q, want %q", tc.filename, got, want)
continue
}
}
}
var metaTestCases = []struct {
meta, want string
}{
{"", ""},
{"text/html", ""},
{"text/html; charset utf-8", ""},
{"text/html; charset=latin-2", "latin-2"},
{"text/html; charset; charset = utf-8", "utf-8"},
{`charset="big5"`, "big5"},
{"charset='shift_jis'", "shift_jis"},
}
func TestFromMeta(t *testing.T) {
for _, tc := range metaTestCases {
got := fromMetaElement(tc.meta)
if got != tc.want {
t.Errorf("%q: got %q, want %q", tc.meta, got, tc.want)
}
}
}

View File

@@ -0,0 +1,107 @@
// +build ignore
package main
// Download http://encoding.spec.whatwg.org/encodings.json and use it to
// generate table.go.
import (
"encoding/json"
"fmt"
"log"
"net/http"
"strings"
)
type enc struct {
Name string
Labels []string
}
type group struct {
Encodings []enc
Heading string
}
const specURL = "http://encoding.spec.whatwg.org/encodings.json"
func main() {
resp, err := http.Get(specURL)
if err != nil {
log.Fatalf("error fetching %s: %s", specURL, err)
}
if resp.StatusCode != 200 {
log.Fatalf("error fetching %s: HTTP status %s", specURL, resp.Status)
}
defer resp.Body.Close()
var groups []group
d := json.NewDecoder(resp.Body)
err = d.Decode(&groups)
if err != nil {
log.Fatalf("error reading encodings.json: %s", err)
}
fmt.Println("// generated by go run gen.go; DO NOT EDIT")
fmt.Println()
fmt.Println("package charset")
fmt.Println()
fmt.Println("import (")
fmt.Println(`"code.google.com/p/go.text/encoding"`)
for _, pkg := range []string{"charmap", "japanese", "korean", "simplifiedchinese", "traditionalchinese", "unicode"} {
fmt.Printf("\"code.google.com/p/go.text/encoding/%s\"\n", pkg)
}
fmt.Println(")")
fmt.Println()
fmt.Println("var encodings = map[string]struct{e encoding.Encoding; name string} {")
for _, g := range groups {
for _, e := range g.Encodings {
goName, ok := miscNames[e.Name]
if !ok {
for k, v := range prefixes {
if strings.HasPrefix(e.Name, k) {
goName = v + e.Name[len(k):]
break
}
}
if goName == "" {
log.Fatalf("unrecognized encoding name: %s", e.Name)
}
}
for _, label := range e.Labels {
fmt.Printf("%q: {%s, %q},\n", label, goName, e.Name)
}
}
}
fmt.Println("}")
}
var prefixes = map[string]string{
"iso-8859-": "charmap.ISO8859_",
"windows-": "charmap.Windows",
}
var miscNames = map[string]string{
"utf-8": "encoding.Nop",
"ibm866": "charmap.CodePage866",
"iso-8859-8-i": "charmap.ISO8859_8",
"koi8-r": "charmap.KOI8R",
"koi8-u": "charmap.KOI8U",
"macintosh": "charmap.Macintosh",
"x-mac-cyrillic": "charmap.MacintoshCyrillic",
"gbk": "simplifiedchinese.GBK",
"gb18030": "simplifiedchinese.GB18030",
"hz-gb-2312": "simplifiedchinese.HZGB2312",
"big5": "traditionalchinese.Big5",
"euc-jp": "japanese.EUCJP",
"iso-2022-jp": "japanese.ISO2022JP",
"shift_jis": "japanese.ShiftJIS",
"euc-kr": "korean.EUCKR",
"replacement": "encoding.Replacement",
"utf-16be": "unicode.UTF16(unicode.BigEndian, unicode.IgnoreBOM)",
"utf-16le": "unicode.UTF16(unicode.LittleEndian, unicode.IgnoreBOM)",
"x-user-defined": "charmap.XUserDefined",
}

View File

@@ -0,0 +1,235 @@
// generated by go run gen.go; DO NOT EDIT
package charset
import (
"code.google.com/p/go.text/encoding"
"code.google.com/p/go.text/encoding/charmap"
"code.google.com/p/go.text/encoding/japanese"
"code.google.com/p/go.text/encoding/korean"
"code.google.com/p/go.text/encoding/simplifiedchinese"
"code.google.com/p/go.text/encoding/traditionalchinese"
"code.google.com/p/go.text/encoding/unicode"
)
var encodings = map[string]struct {
e encoding.Encoding
name string
}{
"unicode-1-1-utf-8": {encoding.Nop, "utf-8"},
"utf-8": {encoding.Nop, "utf-8"},
"utf8": {encoding.Nop, "utf-8"},
"866": {charmap.CodePage866, "ibm866"},
"cp866": {charmap.CodePage866, "ibm866"},
"csibm866": {charmap.CodePage866, "ibm866"},
"ibm866": {charmap.CodePage866, "ibm866"},
"csisolatin2": {charmap.ISO8859_2, "iso-8859-2"},
"iso-8859-2": {charmap.ISO8859_2, "iso-8859-2"},
"iso-ir-101": {charmap.ISO8859_2, "iso-8859-2"},
"iso8859-2": {charmap.ISO8859_2, "iso-8859-2"},
"iso88592": {charmap.ISO8859_2, "iso-8859-2"},
"iso_8859-2": {charmap.ISO8859_2, "iso-8859-2"},
"iso_8859-2:1987": {charmap.ISO8859_2, "iso-8859-2"},
"l2": {charmap.ISO8859_2, "iso-8859-2"},
"latin2": {charmap.ISO8859_2, "iso-8859-2"},
"csisolatin3": {charmap.ISO8859_3, "iso-8859-3"},
"iso-8859-3": {charmap.ISO8859_3, "iso-8859-3"},
"iso-ir-109": {charmap.ISO8859_3, "iso-8859-3"},
"iso8859-3": {charmap.ISO8859_3, "iso-8859-3"},
"iso88593": {charmap.ISO8859_3, "iso-8859-3"},
"iso_8859-3": {charmap.ISO8859_3, "iso-8859-3"},
"iso_8859-3:1988": {charmap.ISO8859_3, "iso-8859-3"},
"l3": {charmap.ISO8859_3, "iso-8859-3"},
"latin3": {charmap.ISO8859_3, "iso-8859-3"},
"csisolatin4": {charmap.ISO8859_4, "iso-8859-4"},
"iso-8859-4": {charmap.ISO8859_4, "iso-8859-4"},
"iso-ir-110": {charmap.ISO8859_4, "iso-8859-4"},
"iso8859-4": {charmap.ISO8859_4, "iso-8859-4"},
"iso88594": {charmap.ISO8859_4, "iso-8859-4"},
"iso_8859-4": {charmap.ISO8859_4, "iso-8859-4"},
"iso_8859-4:1988": {charmap.ISO8859_4, "iso-8859-4"},
"l4": {charmap.ISO8859_4, "iso-8859-4"},
"latin4": {charmap.ISO8859_4, "iso-8859-4"},
"csisolatincyrillic": {charmap.ISO8859_5, "iso-8859-5"},
"cyrillic": {charmap.ISO8859_5, "iso-8859-5"},
"iso-8859-5": {charmap.ISO8859_5, "iso-8859-5"},
"iso-ir-144": {charmap.ISO8859_5, "iso-8859-5"},
"iso8859-5": {charmap.ISO8859_5, "iso-8859-5"},
"iso88595": {charmap.ISO8859_5, "iso-8859-5"},
"iso_8859-5": {charmap.ISO8859_5, "iso-8859-5"},
"iso_8859-5:1988": {charmap.ISO8859_5, "iso-8859-5"},
"arabic": {charmap.ISO8859_6, "iso-8859-6"},
"asmo-708": {charmap.ISO8859_6, "iso-8859-6"},
"csiso88596e": {charmap.ISO8859_6, "iso-8859-6"},
"csiso88596i": {charmap.ISO8859_6, "iso-8859-6"},
"csisolatinarabic": {charmap.ISO8859_6, "iso-8859-6"},
"ecma-114": {charmap.ISO8859_6, "iso-8859-6"},
"iso-8859-6": {charmap.ISO8859_6, "iso-8859-6"},
"iso-8859-6-e": {charmap.ISO8859_6, "iso-8859-6"},
"iso-8859-6-i": {charmap.ISO8859_6, "iso-8859-6"},
"iso-ir-127": {charmap.ISO8859_6, "iso-8859-6"},
"iso8859-6": {charmap.ISO8859_6, "iso-8859-6"},
"iso88596": {charmap.ISO8859_6, "iso-8859-6"},
"iso_8859-6": {charmap.ISO8859_6, "iso-8859-6"},
"iso_8859-6:1987": {charmap.ISO8859_6, "iso-8859-6"},
"csisolatingreek": {charmap.ISO8859_7, "iso-8859-7"},
"ecma-118": {charmap.ISO8859_7, "iso-8859-7"},
"elot_928": {charmap.ISO8859_7, "iso-8859-7"},
"greek": {charmap.ISO8859_7, "iso-8859-7"},
"greek8": {charmap.ISO8859_7, "iso-8859-7"},
"iso-8859-7": {charmap.ISO8859_7, "iso-8859-7"},
"iso-ir-126": {charmap.ISO8859_7, "iso-8859-7"},
"iso8859-7": {charmap.ISO8859_7, "iso-8859-7"},
"iso88597": {charmap.ISO8859_7, "iso-8859-7"},
"iso_8859-7": {charmap.ISO8859_7, "iso-8859-7"},
"iso_8859-7:1987": {charmap.ISO8859_7, "iso-8859-7"},
"sun_eu_greek": {charmap.ISO8859_7, "iso-8859-7"},
"csiso88598e": {charmap.ISO8859_8, "iso-8859-8"},
"csisolatinhebrew": {charmap.ISO8859_8, "iso-8859-8"},
"hebrew": {charmap.ISO8859_8, "iso-8859-8"},
"iso-8859-8": {charmap.ISO8859_8, "iso-8859-8"},
"iso-8859-8-e": {charmap.ISO8859_8, "iso-8859-8"},
"iso-ir-138": {charmap.ISO8859_8, "iso-8859-8"},
"iso8859-8": {charmap.ISO8859_8, "iso-8859-8"},
"iso88598": {charmap.ISO8859_8, "iso-8859-8"},
"iso_8859-8": {charmap.ISO8859_8, "iso-8859-8"},
"iso_8859-8:1988": {charmap.ISO8859_8, "iso-8859-8"},
"visual": {charmap.ISO8859_8, "iso-8859-8"},
"csiso88598i": {charmap.ISO8859_8, "iso-8859-8-i"},
"iso-8859-8-i": {charmap.ISO8859_8, "iso-8859-8-i"},
"logical": {charmap.ISO8859_8, "iso-8859-8-i"},
"csisolatin6": {charmap.ISO8859_10, "iso-8859-10"},
"iso-8859-10": {charmap.ISO8859_10, "iso-8859-10"},
"iso-ir-157": {charmap.ISO8859_10, "iso-8859-10"},
"iso8859-10": {charmap.ISO8859_10, "iso-8859-10"},
"iso885910": {charmap.ISO8859_10, "iso-8859-10"},
"l6": {charmap.ISO8859_10, "iso-8859-10"},
"latin6": {charmap.ISO8859_10, "iso-8859-10"},
"iso-8859-13": {charmap.ISO8859_13, "iso-8859-13"},
"iso8859-13": {charmap.ISO8859_13, "iso-8859-13"},
"iso885913": {charmap.ISO8859_13, "iso-8859-13"},
"iso-8859-14": {charmap.ISO8859_14, "iso-8859-14"},
"iso8859-14": {charmap.ISO8859_14, "iso-8859-14"},
"iso885914": {charmap.ISO8859_14, "iso-8859-14"},
"csisolatin9": {charmap.ISO8859_15, "iso-8859-15"},
"iso-8859-15": {charmap.ISO8859_15, "iso-8859-15"},
"iso8859-15": {charmap.ISO8859_15, "iso-8859-15"},
"iso885915": {charmap.ISO8859_15, "iso-8859-15"},
"iso_8859-15": {charmap.ISO8859_15, "iso-8859-15"},
"l9": {charmap.ISO8859_15, "iso-8859-15"},
"iso-8859-16": {charmap.ISO8859_16, "iso-8859-16"},
"cskoi8r": {charmap.KOI8R, "koi8-r"},
"koi": {charmap.KOI8R, "koi8-r"},
"koi8": {charmap.KOI8R, "koi8-r"},
"koi8-r": {charmap.KOI8R, "koi8-r"},
"koi8_r": {charmap.KOI8R, "koi8-r"},
"koi8-u": {charmap.KOI8U, "koi8-u"},
"csmacintosh": {charmap.Macintosh, "macintosh"},
"mac": {charmap.Macintosh, "macintosh"},
"macintosh": {charmap.Macintosh, "macintosh"},
"x-mac-roman": {charmap.Macintosh, "macintosh"},
"dos-874": {charmap.Windows874, "windows-874"},
"iso-8859-11": {charmap.Windows874, "windows-874"},
"iso8859-11": {charmap.Windows874, "windows-874"},
"iso885911": {charmap.Windows874, "windows-874"},
"tis-620": {charmap.Windows874, "windows-874"},
"windows-874": {charmap.Windows874, "windows-874"},
"cp1250": {charmap.Windows1250, "windows-1250"},
"windows-1250": {charmap.Windows1250, "windows-1250"},
"x-cp1250": {charmap.Windows1250, "windows-1250"},
"cp1251": {charmap.Windows1251, "windows-1251"},
"windows-1251": {charmap.Windows1251, "windows-1251"},
"x-cp1251": {charmap.Windows1251, "windows-1251"},
"ansi_x3.4-1968": {charmap.Windows1252, "windows-1252"},
"ascii": {charmap.Windows1252, "windows-1252"},
"cp1252": {charmap.Windows1252, "windows-1252"},
"cp819": {charmap.Windows1252, "windows-1252"},
"csisolatin1": {charmap.Windows1252, "windows-1252"},
"ibm819": {charmap.Windows1252, "windows-1252"},
"iso-8859-1": {charmap.Windows1252, "windows-1252"},
"iso-ir-100": {charmap.Windows1252, "windows-1252"},
"iso8859-1": {charmap.Windows1252, "windows-1252"},
"iso88591": {charmap.Windows1252, "windows-1252"},
"iso_8859-1": {charmap.Windows1252, "windows-1252"},
"iso_8859-1:1987": {charmap.Windows1252, "windows-1252"},
"l1": {charmap.Windows1252, "windows-1252"},
"latin1": {charmap.Windows1252, "windows-1252"},
"us-ascii": {charmap.Windows1252, "windows-1252"},
"windows-1252": {charmap.Windows1252, "windows-1252"},
"x-cp1252": {charmap.Windows1252, "windows-1252"},
"cp1253": {charmap.Windows1253, "windows-1253"},
"windows-1253": {charmap.Windows1253, "windows-1253"},
"x-cp1253": {charmap.Windows1253, "windows-1253"},
"cp1254": {charmap.Windows1254, "windows-1254"},
"csisolatin5": {charmap.Windows1254, "windows-1254"},
"iso-8859-9": {charmap.Windows1254, "windows-1254"},
"iso-ir-148": {charmap.Windows1254, "windows-1254"},
"iso8859-9": {charmap.Windows1254, "windows-1254"},
"iso88599": {charmap.Windows1254, "windows-1254"},
"iso_8859-9": {charmap.Windows1254, "windows-1254"},
"iso_8859-9:1989": {charmap.Windows1254, "windows-1254"},
"l5": {charmap.Windows1254, "windows-1254"},
"latin5": {charmap.Windows1254, "windows-1254"},
"windows-1254": {charmap.Windows1254, "windows-1254"},
"x-cp1254": {charmap.Windows1254, "windows-1254"},
"cp1255": {charmap.Windows1255, "windows-1255"},
"windows-1255": {charmap.Windows1255, "windows-1255"},
"x-cp1255": {charmap.Windows1255, "windows-1255"},
"cp1256": {charmap.Windows1256, "windows-1256"},
"windows-1256": {charmap.Windows1256, "windows-1256"},
"x-cp1256": {charmap.Windows1256, "windows-1256"},
"cp1257": {charmap.Windows1257, "windows-1257"},
"windows-1257": {charmap.Windows1257, "windows-1257"},
"x-cp1257": {charmap.Windows1257, "windows-1257"},
"cp1258": {charmap.Windows1258, "windows-1258"},
"windows-1258": {charmap.Windows1258, "windows-1258"},
"x-cp1258": {charmap.Windows1258, "windows-1258"},
"x-mac-cyrillic": {charmap.MacintoshCyrillic, "x-mac-cyrillic"},
"x-mac-ukrainian": {charmap.MacintoshCyrillic, "x-mac-cyrillic"},
"chinese": {simplifiedchinese.GBK, "gbk"},
"csgb2312": {simplifiedchinese.GBK, "gbk"},
"csiso58gb231280": {simplifiedchinese.GBK, "gbk"},
"gb2312": {simplifiedchinese.GBK, "gbk"},
"gb_2312": {simplifiedchinese.GBK, "gbk"},
"gb_2312-80": {simplifiedchinese.GBK, "gbk"},
"gbk": {simplifiedchinese.GBK, "gbk"},
"iso-ir-58": {simplifiedchinese.GBK, "gbk"},
"x-gbk": {simplifiedchinese.GBK, "gbk"},
"gb18030": {simplifiedchinese.GB18030, "gb18030"},
"hz-gb-2312": {simplifiedchinese.HZGB2312, "hz-gb-2312"},
"big5": {traditionalchinese.Big5, "big5"},
"big5-hkscs": {traditionalchinese.Big5, "big5"},
"cn-big5": {traditionalchinese.Big5, "big5"},
"csbig5": {traditionalchinese.Big5, "big5"},
"x-x-big5": {traditionalchinese.Big5, "big5"},
"cseucpkdfmtjapanese": {japanese.EUCJP, "euc-jp"},
"euc-jp": {japanese.EUCJP, "euc-jp"},
"x-euc-jp": {japanese.EUCJP, "euc-jp"},
"csiso2022jp": {japanese.ISO2022JP, "iso-2022-jp"},
"iso-2022-jp": {japanese.ISO2022JP, "iso-2022-jp"},
"csshiftjis": {japanese.ShiftJIS, "shift_jis"},
"ms_kanji": {japanese.ShiftJIS, "shift_jis"},
"shift-jis": {japanese.ShiftJIS, "shift_jis"},
"shift_jis": {japanese.ShiftJIS, "shift_jis"},
"sjis": {japanese.ShiftJIS, "shift_jis"},
"windows-31j": {japanese.ShiftJIS, "shift_jis"},
"x-sjis": {japanese.ShiftJIS, "shift_jis"},
"cseuckr": {korean.EUCKR, "euc-kr"},
"csksc56011987": {korean.EUCKR, "euc-kr"},
"euc-kr": {korean.EUCKR, "euc-kr"},
"iso-ir-149": {korean.EUCKR, "euc-kr"},
"korean": {korean.EUCKR, "euc-kr"},
"ks_c_5601-1987": {korean.EUCKR, "euc-kr"},
"ks_c_5601-1989": {korean.EUCKR, "euc-kr"},
"ksc5601": {korean.EUCKR, "euc-kr"},
"ksc_5601": {korean.EUCKR, "euc-kr"},
"windows-949": {korean.EUCKR, "euc-kr"},
"csiso2022kr": {encoding.Replacement, "replacement"},
"iso-2022-kr": {encoding.Replacement, "replacement"},
"iso-2022-cn": {encoding.Replacement, "replacement"},
"iso-2022-cn-ext": {encoding.Replacement, "replacement"},
"utf-16be": {unicode.UTF16(unicode.BigEndian, unicode.IgnoreBOM), "utf-16be"},
"utf-16": {unicode.UTF16(unicode.LittleEndian, unicode.IgnoreBOM), "utf-16le"},
"utf-16le": {unicode.UTF16(unicode.LittleEndian, unicode.IgnoreBOM), "utf-16le"},
"x-user-defined": {charmap.XUserDefined, "x-user-defined"},
}

View File

@@ -0,0 +1,48 @@
<!DOCTYPE html>
<html lang="en" >
<head>
<title>HTTP charset</title>
<link rel='author' title='Richard Ishida' href='mailto:ishida@w3.org'>
<link rel='help' href='http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream'>
<link rel="stylesheet" type="text/css" href="./generatedtests.css">
<script src="http://w3c-test.org/resources/testharness.js"></script>
<script src="http://w3c-test.org/resources/testharnessreport.js"></script>
<meta name='flags' content='http'>
<meta name="assert" content="The character encoding of a page can be set using the HTTP header charset declaration.">
<style type='text/css'>
.test div { width: 50px; }</style>
<link rel="stylesheet" type="text/css" href="the-input-byte-stream/support/encodingtests-15.css">
</head>
<body>
<p class='title'>HTTP charset</p>
<div id='log'></div>
<div class='test'><div id='box' class='ýäè'>&#xA0;</div></div>
<div class='description'>
<p class="assertion" title="Assertion">The character encoding of a page can be set using the HTTP header charset declaration.</p>
<div class="notes"><p><p>The test contains a div with a class name that contains the following sequence of bytes: 0xC3 0xBD 0xC3 0xA4 0xC3 0xA8. These represent different sequences of characters in ISO 8859-15, ISO 8859-1 and UTF-8. The external, UTF-8-encoded stylesheet contains a selector <code>.test div.&#x00C3;&#x0153;&#x00C3;&#x20AC;&#x00C3;&#x0161;</code>. This matches the sequence of bytes above when they are interpreted as ISO 8859-15. If the class name matches the selector then the test will pass.</p><p>The only character encoding declaration for this HTML file is in the HTTP header, which sets the encoding to ISO 8859-15.</p></p>
</div>
</div>
<div class="nexttest"><div><a href="generate?test=the-input-byte-stream-003">Next test</a></div><div class="doctype">HTML5</div>
<p class="jump">the-input-byte-stream-001<br /><a href="/International/tests/html5/the-input-byte-stream/results-basics#basics" target="_blank">Result summary &amp; related tests</a><br /><a href="http://w3c-test.org/framework/details/i18n-html5/the-input-byte-stream-001" target="_blank">Detailed results for this test</a><br/> <a href="http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream" target="_blank">Link to spec</a></p>
<div class='prereq'>Assumptions: <ul><li>The default encoding for the browser you are testing is not set to ISO 8859-15.</li>
<li>The test is read from a server that supports HTTP.</li></ul></div>
</div>
<script>
test(function() {
assert_equals(document.getElementById('box').offsetWidth, 100);
}, " ");
</script>
</body>
</html>

View File

@@ -0,0 +1,48 @@
<!DOCTYPE html>
<html lang="en" >
<head>
<title>HTTP vs UTF-8 BOM</title>
<link rel='author' title='Richard Ishida' href='mailto:ishida@w3.org'>
<link rel='help' href='http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream'>
<link rel="stylesheet" type="text/css" href="./generatedtests.css">
<script src="http://w3c-test.org/resources/testharness.js"></script>
<script src="http://w3c-test.org/resources/testharnessreport.js"></script>
<meta name='flags' content='http'>
<meta name="assert" content="A character encoding set in the HTTP header has lower precedence than the UTF-8 signature.">
<style type='text/css'>
.test div { width: 50px; }</style>
<link rel="stylesheet" type="text/css" href="the-input-byte-stream/support/encodingtests-utf8.css">
</head>
<body>
<p class='title'>HTTP vs UTF-8 BOM</p>
<div id='log'></div>
<div class='test'><div id='box' class='ýäè'>&#xA0;</div></div>
<div class='description'>
<p class="assertion" title="Assertion">A character encoding set in the HTTP header has lower precedence than the UTF-8 signature.</p>
<div class="notes"><p><p>The HTTP header attempts to set the character encoding to ISO 8859-15. The page starts with a UTF-8 signature.</p><p>The test contains a div with a class name that contains the following sequence of bytes: 0xC3 0xBD 0xC3 0xA4 0xC3 0xA8. These represent different sequences of characters in ISO 8859-15, ISO 8859-1 and UTF-8. The external, UTF-8-encoded stylesheet contains a selector <code>.test div.&#x00FD;&#x00E4;&#x00E8;</code>. This matches the sequence of bytes above when they are interpreted as UTF-8. If the class name matches the selector then the test will pass.</p><p>If the test is unsuccessful, the characters &#x00EF;&#x00BB;&#x00BF; should appear at the top of the page. These represent the bytes that make up the UTF-8 signature when encountered in the ISO 8859-15 encoding.</p></p>
</div>
</div>
<div class="nexttest"><div><a href="generate?test=the-input-byte-stream-022">Next test</a></div><div class="doctype">HTML5</div>
<p class="jump">the-input-byte-stream-034<br /><a href="/International/tests/html5/the-input-byte-stream/results-basics#precedence" target="_blank">Result summary &amp; related tests</a><br /><a href="http://w3c-test.org/framework/details/i18n-html5/the-input-byte-stream-034" target="_blank">Detailed results for this test</a><br/> <a href="http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream" target="_blank">Link to spec</a></p>
<div class='prereq'>Assumptions: <ul><li>The default encoding for the browser you are testing is not set to ISO 8859-15.</li>
<li>The test is read from a server that supports HTTP.</li></ul></div>
</div>
<script>
test(function() {
assert_equals(document.getElementById('box').offsetWidth, 100);
}, " ");
</script>
</body>
</html>

View File

@@ -0,0 +1,49 @@
<!DOCTYPE html>
<html lang="en" >
<head>
<meta charset="iso-8859-1" > <title>HTTP vs meta charset</title>
<link rel='author' title='Richard Ishida' href='mailto:ishida@w3.org'>
<link rel='help' href='http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream'>
<link rel="stylesheet" type="text/css" href="./generatedtests.css">
<script src="http://w3c-test.org/resources/testharness.js"></script>
<script src="http://w3c-test.org/resources/testharnessreport.js"></script>
<meta name='flags' content='http'>
<meta name="assert" content="The HTTP header has a higher precedence than an encoding declaration in a meta charset attribute.">
<style type='text/css'>
.test div { width: 50px; }.test div { width: 90px; }
</style>
<link rel="stylesheet" type="text/css" href="the-input-byte-stream/support/encodingtests-15.css">
</head>
<body>
<p class='title'>HTTP vs meta charset</p>
<div id='log'></div>
<div class='test'><div id='box' class='ýäè'>&#xA0;</div></div>
<div class='description'>
<p class="assertion" title="Assertion">The HTTP header has a higher precedence than an encoding declaration in a meta charset attribute.</p>
<div class="notes"><p><p>The HTTP header attempts to set the character encoding to ISO 8859-15. The page contains an encoding declaration in a meta charset attribute that attempts to set the character encoding to ISO 8859-1.</p><p>The test contains a div with a class name that contains the following sequence of bytes: 0xC3 0xBD 0xC3 0xA4 0xC3 0xA8. These represent different sequences of characters in ISO 8859-15, ISO 8859-1 and UTF-8. The external, UTF-8-encoded stylesheet contains a selector <code>.test div.&#x00C3;&#x0153;&#x00C3;&#x20AC;&#x00C3;&#x0161;</code>. This matches the sequence of bytes above when they are interpreted as ISO 8859-15. If the class name matches the selector then the test will pass.</p></p>
</div>
</div>
<div class="nexttest"><div><a href="generate?test=the-input-byte-stream-037">Next test</a></div><div class="doctype">HTML5</div>
<p class="jump">the-input-byte-stream-018<br /><a href="/International/tests/html5/the-input-byte-stream/results-basics#precedence" target="_blank">Result summary &amp; related tests</a><br /><a href="http://w3c-test.org/framework/details/i18n-html5/the-input-byte-stream-018" target="_blank">Detailed results for this test</a><br/> <a href="http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream" target="_blank">Link to spec</a></p>
<div class='prereq'>Assumptions: <ul><li>The default encoding for the browser you are testing is not set to ISO 8859-15.</li>
<li>The test is read from a server that supports HTTP.</li></ul></div>
</div>
<script>
test(function() {
assert_equals(document.getElementById('box').offsetWidth, 100);
}, " ");
</script>
</body>
</html>

View File

@@ -0,0 +1,49 @@
<!DOCTYPE html>
<html lang="en" >
<head>
<meta http-equiv="content-type" content="text/html;charset=iso-8859-1" > <title>HTTP vs meta content</title>
<link rel='author' title='Richard Ishida' href='mailto:ishida@w3.org'>
<link rel='help' href='http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream'>
<link rel="stylesheet" type="text/css" href="./generatedtests.css">
<script src="http://w3c-test.org/resources/testharness.js"></script>
<script src="http://w3c-test.org/resources/testharnessreport.js"></script>
<meta name='flags' content='http'>
<meta name="assert" content="The HTTP header has a higher precedence than an encoding declaration in a meta content attribute.">
<style type='text/css'>
.test div { width: 50px; }.test div { width: 90px; }
</style>
<link rel="stylesheet" type="text/css" href="the-input-byte-stream/support/encodingtests-15.css">
</head>
<body>
<p class='title'>HTTP vs meta content</p>
<div id='log'></div>
<div class='test'><div id='box' class='ýäè'>&#xA0;</div></div>
<div class='description'>
<p class="assertion" title="Assertion">The HTTP header has a higher precedence than an encoding declaration in a meta content attribute.</p>
<div class="notes"><p><p>The HTTP header attempts to set the character encoding to ISO 8859-15. The page contains an encoding declaration in a meta content attribute that attempts to set the character encoding to ISO 8859-1.</p><p>The test contains a div with a class name that contains the following sequence of bytes: 0xC3 0xBD 0xC3 0xA4 0xC3 0xA8. These represent different sequences of characters in ISO 8859-15, ISO 8859-1 and UTF-8. The external, UTF-8-encoded stylesheet contains a selector <code>.test div.&#x00C3;&#x0153;&#x00C3;&#x20AC;&#x00C3;&#x0161;</code>. This matches the sequence of bytes above when they are interpreted as ISO 8859-15. If the class name matches the selector then the test will pass.</p></p>
</div>
</div>
<div class="nexttest"><div><a href="generate?test=the-input-byte-stream-018">Next test</a></div><div class="doctype">HTML5</div>
<p class="jump">the-input-byte-stream-016<br /><a href="/International/tests/html5/the-input-byte-stream/results-basics#precedence" target="_blank">Result summary &amp; related tests</a><br /><a href="http://w3c-test.org/framework/details/i18n-html5/the-input-byte-stream-016" target="_blank">Detailed results for this test</a><br/> <a href="http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream" target="_blank">Link to spec</a></p>
<div class='prereq'>Assumptions: <ul><li>The default encoding for the browser you are testing is not set to ISO 8859-15.</li>
<li>The test is read from a server that supports HTTP.</li></ul></div>
</div>
<script>
test(function() {
assert_equals(document.getElementById('box').offsetWidth, 100);
}, " ");
</script>
</body>
</html>

View File

@@ -0,0 +1,47 @@
<!DOCTYPE html>
<html lang="en" >
<head>
<title>No encoding declaration</title>
<link rel='author' title='Richard Ishida' href='mailto:ishida@w3.org'>
<link rel='help' href='http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream'>
<link rel="stylesheet" type="text/css" href="./generatedtests.css">
<script src="http://w3c-test.org/resources/testharness.js"></script>
<script src="http://w3c-test.org/resources/testharnessreport.js"></script>
<meta name='flags' content='http'>
<meta name="assert" content="A page with no encoding information in HTTP, BOM, XML declaration or meta element will be treated as UTF-8.">
<style type='text/css'>
.test div { width: 50px; }</style>
<link rel="stylesheet" type="text/css" href="the-input-byte-stream/support/encodingtests-utf8.css">
</head>
<body>
<p class='title'>No encoding declaration</p>
<div id='log'></div>
<div class='test'><div id='box' class='ýäè'>&#xA0;</div></div>
<div class='description'>
<p class="assertion" title="Assertion">A page with no encoding information in HTTP, BOM, XML declaration or meta element will be treated as UTF-8.</p>
<div class="notes"><p><p>The test on this page contains a div with a class name that contains the following sequence of bytes: 0xC3 0xBD 0xC3 0xA4 0xC3 0xA8. These represent different sequences of characters in ISO 8859-15, ISO 8859-1 and UTF-8. The external, UTF-8-encoded stylesheet contains a selector <code>.test div.&#x00FD;&#x00E4;&#x00E8;</code>. This matches the sequence of bytes above when they are interpreted as UTF-8. If the class name matches the selector then the test will pass.</p></p>
</div>
</div>
<div class="nexttest"><div><a href="generate?test=the-input-byte-stream-034">Next test</a></div><div class="doctype">HTML5</div>
<p class="jump">the-input-byte-stream-015<br /><a href="/International/tests/html5/the-input-byte-stream/results-basics#basics" target="_blank">Result summary &amp; related tests</a><br /><a href="http://w3c-test.org/framework/details/i18n-html5/the-input-byte-stream-015" target="_blank">Detailed results for this test</a><br/> <a href="http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream" target="_blank">Link to spec</a></p>
<div class='prereq'>Assumptions: <ul><li>The test is read from a server that supports HTTP.</li></ul></div>
</div>
<script>
test(function() {
assert_equals(document.getElementById('box').offsetWidth, 100);
}, " ");
</script>
</body>
</html>

View File

@@ -0,0 +1 @@
These test cases come from http://www.w3.org/International/tests/html5/the-input-byte-stream/results-basics

View File

Binary file not shown.

View File

Binary file not shown.

View File

@@ -0,0 +1,49 @@
<!DOCTYPE html>
<html lang="en" >
<head>
<meta charset="iso-8859-15"> <title>UTF-8 BOM vs meta charset</title>
<link rel='author' title='Richard Ishida' href='mailto:ishida@w3.org'>
<link rel='help' href='http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream'>
<link rel="stylesheet" type="text/css" href="./generatedtests.css">
<script src="http://w3c-test.org/resources/testharness.js"></script>
<script src="http://w3c-test.org/resources/testharnessreport.js"></script>
<meta name='flags' content='http'>
<meta name="assert" content="A page with a UTF-8 BOM will be recognized as UTF-8 even if the meta charset attribute declares a different encoding.">
<style type='text/css'>
.test div { width: 50px; }.test div { width: 90px; }
</style>
<link rel="stylesheet" type="text/css" href="the-input-byte-stream/support/encodingtests-utf8.css">
</head>
<body>
<p class='title'>UTF-8 BOM vs meta charset</p>
<div id='log'></div>
<div class='test'><div id='box' class='ýäè'>&#xA0;</div></div>
<div class='description'>
<p class="assertion" title="Assertion">A page with a UTF-8 BOM will be recognized as UTF-8 even if the meta charset attribute declares a different encoding.</p>
<div class="notes"><p><p>The page contains an encoding declaration in a meta charset attribute that attempts to set the character encoding to ISO 8859-15, but the file starts with a UTF-8 signature.</p><p>The test contains a div with a class name that contains the following sequence of bytes: 0xC3 0xBD 0xC3 0xA4 0xC3 0xA8. These represent different sequences of characters in ISO 8859-15, ISO 8859-1 and UTF-8. The external, UTF-8-encoded stylesheet contains a selector <code>.test div.&#x00FD;&#x00E4;&#x00E8;</code>. This matches the sequence of bytes above when they are interpreted as UTF-8. If the class name matches the selector then the test will pass.</p></p>
</div>
</div>
<div class="nexttest"><div><a href="generate?test=the-input-byte-stream-024">Next test</a></div><div class="doctype">HTML5</div>
<p class="jump">the-input-byte-stream-038<br /><a href="/International/tests/html5/the-input-byte-stream/results-basics#precedence" target="_blank">Result summary &amp; related tests</a><br /><a href="http://w3c-test.org/framework/details/i18n-html5/the-input-byte-stream-038" target="_blank">Detailed results for this test</a><br/> <a href="http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream" target="_blank">Link to spec</a></p>
<div class='prereq'>Assumptions: <ul><li>The default encoding for the browser you are testing is not set to ISO 8859-15.</li>
<li>The test is read from a server that supports HTTP.</li></ul></div>
</div>
<script>
test(function() {
assert_equals(document.getElementById('box').offsetWidth, 100);
}, " ");
</script>
</body>
</html>

View File

@@ -0,0 +1,48 @@
<!DOCTYPE html>
<html lang="en" >
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-15"> <title>UTF-8 BOM vs meta content</title>
<link rel='author' title='Richard Ishida' href='mailto:ishida@w3.org'>
<link rel='help' href='http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream'>
<link rel="stylesheet" type="text/css" href="./generatedtests.css">
<script src="http://w3c-test.org/resources/testharness.js"></script>
<script src="http://w3c-test.org/resources/testharnessreport.js"></script>
<meta name='flags' content='http'>
<meta name="assert" content="A page with a UTF-8 BOM will be recognized as UTF-8 even if the meta content attribute declares a different encoding.">
<style type='text/css'>
.test div { width: 50px; }</style>
<link rel="stylesheet" type="text/css" href="the-input-byte-stream/support/encodingtests-utf8.css">
</head>
<body>
<p class='title'>UTF-8 BOM vs meta content</p>
<div id='log'></div>
<div class='test'><div id='box' class='ýäè'>&#xA0;</div></div>
<div class='description'>
<p class="assertion" title="Assertion">A page with a UTF-8 BOM will be recognized as UTF-8 even if the meta content attribute declares a different encoding.</p>
<div class="notes"><p><p>The page contains an encoding declaration in a meta content attribute that attempts to set the character encoding to ISO 8859-15, but the file starts with a UTF-8 signature.</p><p>The test contains a div with a class name that contains the following sequence of bytes: 0xC3 0xBD 0xC3 0xA4 0xC3 0xA8. These represent different sequences of characters in ISO 8859-15, ISO 8859-1 and UTF-8. The external, UTF-8-encoded stylesheet contains a selector <code>.test div.&#x00FD;&#x00E4;&#x00E8;</code>. This matches the sequence of bytes above when they are interpreted as UTF-8. If the class name matches the selector then the test will pass.</p></p>
</div>
</div>
<div class="nexttest"><div><a href="generate?test=the-input-byte-stream-038">Next test</a></div><div class="doctype">HTML5</div>
<p class="jump">the-input-byte-stream-037<br /><a href="/International/tests/html5/the-input-byte-stream/results-basics#precedence" target="_blank">Result summary &amp; related tests</a><br /><a href="http://w3c-test.org/framework/details/i18n-html5/the-input-byte-stream-037" target="_blank">Detailed results for this test</a><br/> <a href="http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream" target="_blank">Link to spec</a></p>
<div class='prereq'>Assumptions: <ul><li>The default encoding for the browser you are testing is not set to ISO 8859-15.</li>
<li>The test is read from a server that supports HTTP.</li></ul></div>
</div>
<script>
test(function() {
assert_equals(document.getElementById('box').offsetWidth, 100);
}, " ");
</script>
</body>
</html>

View File

@@ -0,0 +1,48 @@
<!DOCTYPE html>
<html lang="en" >
<head>
<meta charset="iso-8859-15"> <title>meta charset attribute</title>
<link rel='author' title='Richard Ishida' href='mailto:ishida@w3.org'>
<link rel='help' href='http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream'>
<link rel="stylesheet" type="text/css" href="./generatedtests.css">
<script src="http://w3c-test.org/resources/testharness.js"></script>
<script src="http://w3c-test.org/resources/testharnessreport.js"></script>
<meta name='flags' content='http'>
<meta name="assert" content="The character encoding of the page can be set by a meta element with charset attribute.">
<style type='text/css'>
.test div { width: 50px; }</style>
<link rel="stylesheet" type="text/css" href="the-input-byte-stream/support/encodingtests-15.css">
</head>
<body>
<p class='title'>meta charset attribute</p>
<div id='log'></div>
<div class='test'><div id='box' class='ýäè'>&#xA0;</div></div>
<div class='description'>
<p class="assertion" title="Assertion">The character encoding of the page can be set by a meta element with charset attribute.</p>
<div class="notes"><p><p>The only character encoding declaration for this HTML file is in the charset attribute of the meta element, which declares the encoding to be ISO 8859-15.</p><p>The test contains a div with a class name that contains the following sequence of bytes: 0xC3 0xBD 0xC3 0xA4 0xC3 0xA8. These represent different sequences of characters in ISO 8859-15, ISO 8859-1 and UTF-8. The external, UTF-8-encoded stylesheet contains a selector <code>.test div.&#x00C3;&#x0153;&#x00C3;&#x20AC;&#x00C3;&#x0161;</code>. This matches the sequence of bytes above when they are interpreted as ISO 8859-15. If the class name matches the selector then the test will pass.</p></p>
</div>
</div>
<div class="nexttest"><div><a href="generate?test=the-input-byte-stream-015">Next test</a></div><div class="doctype">HTML5</div>
<p class="jump">the-input-byte-stream-009<br /><a href="/International/tests/html5/the-input-byte-stream/results-basics#basics" target="_blank">Result summary &amp; related tests</a><br /><a href="http://w3c-test.org/framework/details/i18n-html5/the-input-byte-stream-009" target="_blank">Detailed results for this test</a><br/> <a href="http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream" target="_blank">Link to spec</a></p>
<div class='prereq'>Assumptions: <ul><li>The default encoding for the browser you are testing is not set to ISO 8859-15.</li>
<li>The test is read from a server that supports HTTP.</li></ul></div>
</div>
<script>
test(function() {
assert_equals(document.getElementById('box').offsetWidth, 100);
}, " ");
</script>
</body>
</html>

View File

@@ -0,0 +1,48 @@
<!DOCTYPE html>
<html lang="en" >
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-15"> <title>meta content attribute</title>
<link rel='author' title='Richard Ishida' href='mailto:ishida@w3.org'>
<link rel='help' href='http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream'>
<link rel="stylesheet" type="text/css" href="./generatedtests.css">
<script src="http://w3c-test.org/resources/testharness.js"></script>
<script src="http://w3c-test.org/resources/testharnessreport.js"></script>
<meta name='flags' content='http'>
<meta name="assert" content="The character encoding of the page can be set by a meta element with http-equiv and content attributes.">
<style type='text/css'>
.test div { width: 50px; }</style>
<link rel="stylesheet" type="text/css" href="the-input-byte-stream/support/encodingtests-15.css">
</head>
<body>
<p class='title'>meta content attribute</p>
<div id='log'></div>
<div class='test'><div id='box' class='ýäè'>&#xA0;</div></div>
<div class='description'>
<p class="assertion" title="Assertion">The character encoding of the page can be set by a meta element with http-equiv and content attributes.</p>
<div class="notes"><p><p>The only character encoding declaration for this HTML file is in the content attribute of the meta element, which declares the encoding to be ISO 8859-15.</p><p>The test contains a div with a class name that contains the following sequence of bytes: 0xC3 0xBD 0xC3 0xA4 0xC3 0xA8. These represent different sequences of characters in ISO 8859-15, ISO 8859-1 and UTF-8. The external, UTF-8-encoded stylesheet contains a selector <code>.test div.&#x00C3;&#x0153;&#x00C3;&#x20AC;&#x00C3;&#x0161;</code>. This matches the sequence of bytes above when they are interpreted as ISO 8859-15. If the class name matches the selector then the test will pass.</p></p>
</div>
</div>
<div class="nexttest"><div><a href="generate?test=the-input-byte-stream-009">Next test</a></div><div class="doctype">HTML5</div>
<p class="jump">the-input-byte-stream-007<br /><a href="/International/tests/html5/the-input-byte-stream/results-basics#basics" target="_blank">Result summary &amp; related tests</a><br /><a href="http://w3c-test.org/framework/details/i18n-html5/the-input-byte-stream-007" target="_blank">Detailed results for this test</a><br/> <a href="http://www.w3.org/TR/html5/syntax.html#the-input-byte-stream" target="_blank">Link to spec</a></p>
<div class='prereq'>Assumptions: <ul><li>The default encoding for the browser you are testing is not set to ISO 8859-15.</li>
<li>The test is read from a server that supports HTTP.</li></ul></div>
</div>
<script>
test(function() {
assert_equals(document.getElementById('box').offsetWidth, 100);
}, " ");
</script>
</body>
</html>

View File

@@ -0,0 +1,100 @@
// Copyright 2011 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package html
// Section 12.2.3.2 of the HTML5 specification says "The following elements
// have varying levels of special parsing rules".
// http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#the-stack-of-open-elements
var isSpecialElementMap = map[string]bool{
"address": true,
"applet": true,
"area": true,
"article": true,
"aside": true,
"base": true,
"basefont": true,
"bgsound": true,
"blockquote": true,
"body": true,
"br": true,
"button": true,
"caption": true,
"center": true,
"col": true,
"colgroup": true,
"command": true,
"dd": true,
"details": true,
"dir": true,
"div": true,
"dl": true,
"dt": true,
"embed": true,
"fieldset": true,
"figcaption": true,
"figure": true,
"footer": true,
"form": true,
"frame": true,
"frameset": true,
"h1": true,
"h2": true,
"h3": true,
"h4": true,
"h5": true,
"h6": true,
"head": true,
"header": true,
"hgroup": true,
"hr": true,
"html": true,
"iframe": true,
"img": true,
"input": true,
"isindex": true,
"li": true,
"link": true,
"listing": true,
"marquee": true,
"menu": true,
"meta": true,
"nav": true,
"noembed": true,
"noframes": true,
"noscript": true,
"object": true,
"ol": true,
"p": true,
"param": true,
"plaintext": true,
"pre": true,
"script": true,
"section": true,
"select": true,
"style": true,
"summary": true,
"table": true,
"tbody": true,
"td": true,
"textarea": true,
"tfoot": true,
"th": true,
"thead": true,
"title": true,
"tr": true,
"ul": true,
"wbr": true,
"xmp": true,
}
func isSpecialElement(element *Node) bool {
switch element.Namespace {
case "", "html":
return isSpecialElementMap[element.Data]
case "svg":
return element.Data == "foreignObject"
}
return false
}

View File

@@ -0,0 +1,106 @@
// Copyright 2010 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
/*
Package html implements an HTML5-compliant tokenizer and parser.
Tokenization is done by creating a Tokenizer for an io.Reader r. It is the
caller's responsibility to ensure that r provides UTF-8 encoded HTML.
z := html.NewTokenizer(r)
Given a Tokenizer z, the HTML is tokenized by repeatedly calling z.Next(),
which parses the next token and returns its type, or an error:
for {
tt := z.Next()
if tt == html.ErrorToken {
// ...
return ...
}
// Process the current token.
}
There are two APIs for retrieving the current token. The high-level API is to
call Token; the low-level API is to call Text or TagName / TagAttr. Both APIs
allow optionally calling Raw after Next but before Token, Text, TagName, or
TagAttr. In EBNF notation, the valid call sequence per token is:
Next {Raw} [ Token | Text | TagName {TagAttr} ]
Token returns an independent data structure that completely describes a token.
Entities (such as "&lt;") are unescaped, tag names and attribute keys are
lower-cased, and attributes are collected into a []Attribute. For example:
for {
if z.Next() == html.ErrorToken {
// Returning io.EOF indicates success.
return z.Err()
}
emitToken(z.Token())
}
The low-level API performs fewer allocations and copies, but the contents of
the []byte values returned by Text, TagName and TagAttr may change on the next
call to Next. For example, to extract an HTML page's anchor text:
depth := 0
for {
tt := z.Next()
switch tt {
case ErrorToken:
return z.Err()
case TextToken:
if depth > 0 {
// emitBytes should copy the []byte it receives,
// if it doesn't process it immediately.
emitBytes(z.Text())
}
case StartTagToken, EndTagToken:
tn, _ := z.TagName()
if len(tn) == 1 && tn[0] == 'a' {
if tt == StartTagToken {
depth++
} else {
depth--
}
}
}
}
Parsing is done by calling Parse with an io.Reader, which returns the root of
the parse tree (the document element) as a *Node. It is the caller's
responsibility to ensure that the Reader provides UTF-8 encoded HTML. For
example, to process each anchor node in depth-first order:
doc, err := html.Parse(r)
if err != nil {
// ...
}
var f func(*html.Node)
f = func(n *html.Node) {
if n.Type == html.ElementNode && n.Data == "a" {
// Do something with n...
}
for c := n.FirstChild; c != nil; c = c.NextSibling {
f(c)
}
}
f(doc)
The relevant specifications include:
http://www.whatwg.org/specs/web-apps/current-work/multipage/syntax.html and
http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html
*/
package html
// The tokenization algorithm implemented by this package is not a line-by-line
// transliteration of the relatively verbose state-machine in the WHATWG
// specification. A more direct approach is used instead, where the program
// counter implies the state, such as whether it is tokenizing a tag or a text
// node. Specification compliance is verified by checking expected and actual
// outputs over a test suite rather than aiming for algorithmic fidelity.
// TODO(nigeltao): Does a DOM API belong in this package or a separate one?
// TODO(nigeltao): How does parsing interact with a JavaScript engine?

View File

@@ -0,0 +1,156 @@
// Copyright 2011 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package html
import (
"strings"
)
// parseDoctype parses the data from a DoctypeToken into a name,
// public identifier, and system identifier. It returns a Node whose Type
// is DoctypeNode, whose Data is the name, and which has attributes
// named "system" and "public" for the two identifiers if they were present.
// quirks is whether the document should be parsed in "quirks mode".
func parseDoctype(s string) (n *Node, quirks bool) {
n = &Node{Type: DoctypeNode}
// Find the name.
space := strings.IndexAny(s, whitespace)
if space == -1 {
space = len(s)
}
n.Data = s[:space]
// The comparison to "html" is case-sensitive.
if n.Data != "html" {
quirks = true
}
n.Data = strings.ToLower(n.Data)
s = strings.TrimLeft(s[space:], whitespace)
if len(s) < 6 {
// It can't start with "PUBLIC" or "SYSTEM".
// Ignore the rest of the string.
return n, quirks || s != ""
}
key := strings.ToLower(s[:6])
s = s[6:]
for key == "public" || key == "system" {
s = strings.TrimLeft(s, whitespace)
if s == "" {
break
}
quote := s[0]
if quote != '"' && quote != '\'' {
break
}
s = s[1:]
q := strings.IndexRune(s, rune(quote))
var id string
if q == -1 {
id = s
s = ""
} else {
id = s[:q]
s = s[q+1:]
}
n.Attr = append(n.Attr, Attribute{Key: key, Val: id})
if key == "public" {
key = "system"
} else {
key = ""
}
}
if key != "" || s != "" {
quirks = true
} else if len(n.Attr) > 0 {
if n.Attr[0].Key == "public" {
public := strings.ToLower(n.Attr[0].Val)
switch public {
case "-//w3o//dtd w3 html strict 3.0//en//", "-/w3d/dtd html 4.0 transitional/en", "html":
quirks = true
default:
for _, q := range quirkyIDs {
if strings.HasPrefix(public, q) {
quirks = true
break
}
}
}
// The following two public IDs only cause quirks mode if there is no system ID.
if len(n.Attr) == 1 && (strings.HasPrefix(public, "-//w3c//dtd html 4.01 frameset//") ||
strings.HasPrefix(public, "-//w3c//dtd html 4.01 transitional//")) {
quirks = true
}
}
if lastAttr := n.Attr[len(n.Attr)-1]; lastAttr.Key == "system" &&
strings.ToLower(lastAttr.Val) == "http://www.ibm.com/data/dtd/v11/ibmxhtml1-transitional.dtd" {
quirks = true
}
}
return n, quirks
}
// quirkyIDs is a list of public doctype identifiers that cause a document
// to be interpreted in quirks mode. The identifiers should be in lower case.
var quirkyIDs = []string{
"+//silmaril//dtd html pro v0r11 19970101//",
"-//advasoft ltd//dtd html 3.0 aswedit + extensions//",
"-//as//dtd html 3.0 aswedit + extensions//",
"-//ietf//dtd html 2.0 level 1//",
"-//ietf//dtd html 2.0 level 2//",
"-//ietf//dtd html 2.0 strict level 1//",
"-//ietf//dtd html 2.0 strict level 2//",
"-//ietf//dtd html 2.0 strict//",
"-//ietf//dtd html 2.0//",
"-//ietf//dtd html 2.1e//",
"-//ietf//dtd html 3.0//",
"-//ietf//dtd html 3.2 final//",
"-//ietf//dtd html 3.2//",
"-//ietf//dtd html 3//",
"-//ietf//dtd html level 0//",
"-//ietf//dtd html level 1//",
"-//ietf//dtd html level 2//",
"-//ietf//dtd html level 3//",
"-//ietf//dtd html strict level 0//",
"-//ietf//dtd html strict level 1//",
"-//ietf//dtd html strict level 2//",
"-//ietf//dtd html strict level 3//",
"-//ietf//dtd html strict//",
"-//ietf//dtd html//",
"-//metrius//dtd metrius presentational//",
"-//microsoft//dtd internet explorer 2.0 html strict//",
"-//microsoft//dtd internet explorer 2.0 html//",
"-//microsoft//dtd internet explorer 2.0 tables//",
"-//microsoft//dtd internet explorer 3.0 html strict//",
"-//microsoft//dtd internet explorer 3.0 html//",
"-//microsoft//dtd internet explorer 3.0 tables//",
"-//netscape comm. corp.//dtd html//",
"-//netscape comm. corp.//dtd strict html//",
"-//o'reilly and associates//dtd html 2.0//",
"-//o'reilly and associates//dtd html extended 1.0//",
"-//o'reilly and associates//dtd html extended relaxed 1.0//",
"-//softquad software//dtd hotmetal pro 6.0::19990601::extensions to html 4.0//",
"-//softquad//dtd hotmetal pro 4.0::19971010::extensions to html 4.0//",
"-//spyglass//dtd html 2.0 extended//",
"-//sq//dtd html 2.0 hotmetal + extensions//",
"-//sun microsystems corp.//dtd hotjava html//",
"-//sun microsystems corp.//dtd hotjava strict html//",
"-//w3c//dtd html 3 1995-03-24//",
"-//w3c//dtd html 3.2 draft//",
"-//w3c//dtd html 3.2 final//",
"-//w3c//dtd html 3.2//",
"-//w3c//dtd html 3.2s draft//",
"-//w3c//dtd html 4.0 frameset//",
"-//w3c//dtd html 4.0 transitional//",
"-//w3c//dtd html experimental 19960712//",
"-//w3c//dtd html experimental 970421//",
"-//w3c//dtd w3 html//",
"-//w3o//dtd w3 html 3.0//",
"-//webtechs//dtd mozilla html 2.0//",
"-//webtechs//dtd mozilla html//",
}

View File

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,29 @@
// Copyright 2010 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package html
import (
"testing"
"unicode/utf8"
)
func TestEntityLength(t *testing.T) {
// We verify that the length of UTF-8 encoding of each value is <= 1 + len(key).
// The +1 comes from the leading "&". This property implies that the length of
// unescaped text is <= the length of escaped text.
for k, v := range entity {
if 1+len(k) < utf8.RuneLen(v) {
t.Error("escaped entity &" + k + " is shorter than its UTF-8 encoding " + string(v))
}
if len(k) > longestEntityWithoutSemicolon && k[len(k)-1] != ';' {
t.Errorf("entity name %s is %d characters, but longestEntityWithoutSemicolon=%d", k, len(k), longestEntityWithoutSemicolon)
}
}
for k, v := range entity2 {
if 1+len(k) < utf8.RuneLen(v[0])+utf8.RuneLen(v[1]) {
t.Error("escaped entity &" + k + " is shorter than its UTF-8 encoding " + string(v[0]) + string(v[1]))
}
}
}

View File

@@ -0,0 +1,258 @@
// Copyright 2010 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package html
import (
"bytes"
"strings"
"unicode/utf8"
)
// These replacements permit compatibility with old numeric entities that
// assumed Windows-1252 encoding.
// http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#consume-a-character-reference
var replacementTable = [...]rune{
'\u20AC', // First entry is what 0x80 should be replaced with.
'\u0081',
'\u201A',
'\u0192',
'\u201E',
'\u2026',
'\u2020',
'\u2021',
'\u02C6',
'\u2030',
'\u0160',
'\u2039',
'\u0152',
'\u008D',
'\u017D',
'\u008F',
'\u0090',
'\u2018',
'\u2019',
'\u201C',
'\u201D',
'\u2022',
'\u2013',
'\u2014',
'\u02DC',
'\u2122',
'\u0161',
'\u203A',
'\u0153',
'\u009D',
'\u017E',
'\u0178', // Last entry is 0x9F.
// 0x00->'\uFFFD' is handled programmatically.
// 0x0D->'\u000D' is a no-op.
}
// unescapeEntity reads an entity like "&lt;" from b[src:] and writes the
// corresponding "<" to b[dst:], returning the incremented dst and src cursors.
// Precondition: b[src] == '&' && dst <= src.
// attribute should be true if parsing an attribute value.
func unescapeEntity(b []byte, dst, src int, attribute bool) (dst1, src1 int) {
// http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#consume-a-character-reference
// i starts at 1 because we already know that s[0] == '&'.
i, s := 1, b[src:]
if len(s) <= 1 {
b[dst] = b[src]
return dst + 1, src + 1
}
if s[i] == '#' {
if len(s) <= 3 { // We need to have at least "&#.".
b[dst] = b[src]
return dst + 1, src + 1
}
i++
c := s[i]
hex := false
if c == 'x' || c == 'X' {
hex = true
i++
}
x := '\x00'
for i < len(s) {
c = s[i]
i++
if hex {
if '0' <= c && c <= '9' {
x = 16*x + rune(c) - '0'
continue
} else if 'a' <= c && c <= 'f' {
x = 16*x + rune(c) - 'a' + 10
continue
} else if 'A' <= c && c <= 'F' {
x = 16*x + rune(c) - 'A' + 10
continue
}
} else if '0' <= c && c <= '9' {
x = 10*x + rune(c) - '0'
continue
}
if c != ';' {
i--
}
break
}
if i <= 3 { // No characters matched.
b[dst] = b[src]
return dst + 1, src + 1
}
if 0x80 <= x && x <= 0x9F {
// Replace characters from Windows-1252 with UTF-8 equivalents.
x = replacementTable[x-0x80]
} else if x == 0 || (0xD800 <= x && x <= 0xDFFF) || x > 0x10FFFF {
// Replace invalid characters with the replacement character.
x = '\uFFFD'
}
return dst + utf8.EncodeRune(b[dst:], x), src + i
}
// Consume the maximum number of characters possible, with the
// consumed characters matching one of the named references.
for i < len(s) {
c := s[i]
i++
// Lower-cased characters are more common in entities, so we check for them first.
if 'a' <= c && c <= 'z' || 'A' <= c && c <= 'Z' || '0' <= c && c <= '9' {
continue
}
if c != ';' {
i--
}
break
}
entityName := string(s[1:i])
if entityName == "" {
// No-op.
} else if attribute && entityName[len(entityName)-1] != ';' && len(s) > i && s[i] == '=' {
// No-op.
} else if x := entity[entityName]; x != 0 {
return dst + utf8.EncodeRune(b[dst:], x), src + i
} else if x := entity2[entityName]; x[0] != 0 {
dst1 := dst + utf8.EncodeRune(b[dst:], x[0])
return dst1 + utf8.EncodeRune(b[dst1:], x[1]), src + i
} else if !attribute {
maxLen := len(entityName) - 1
if maxLen > longestEntityWithoutSemicolon {
maxLen = longestEntityWithoutSemicolon
}
for j := maxLen; j > 1; j-- {
if x := entity[entityName[:j]]; x != 0 {
return dst + utf8.EncodeRune(b[dst:], x), src + j + 1
}
}
}
dst1, src1 = dst+i, src+i
copy(b[dst:dst1], b[src:src1])
return dst1, src1
}
// unescape unescapes b's entities in-place, so that "a&lt;b" becomes "a<b".
// attribute should be true if parsing an attribute value.
func unescape(b []byte, attribute bool) []byte {
for i, c := range b {
if c == '&' {
dst, src := unescapeEntity(b, i, i, attribute)
for src < len(b) {
c := b[src]
if c == '&' {
dst, src = unescapeEntity(b, dst, src, attribute)
} else {
b[dst] = c
dst, src = dst+1, src+1
}
}
return b[0:dst]
}
}
return b
}
// lower lower-cases the A-Z bytes in b in-place, so that "aBc" becomes "abc".
func lower(b []byte) []byte {
for i, c := range b {
if 'A' <= c && c <= 'Z' {
b[i] = c + 'a' - 'A'
}
}
return b
}
const escapedChars = "&'<>\"\r"
func escape(w writer, s string) error {
i := strings.IndexAny(s, escapedChars)
for i != -1 {
if _, err := w.WriteString(s[:i]); err != nil {
return err
}
var esc string
switch s[i] {
case '&':
esc = "&amp;"
case '\'':
// "&#39;" is shorter than "&apos;" and apos was not in HTML until HTML5.
esc = "&#39;"
case '<':
esc = "&lt;"
case '>':
esc = "&gt;"
case '"':
// "&#34;" is shorter than "&quot;".
esc = "&#34;"
case '\r':
esc = "&#13;"
default:
panic("unrecognized escape character")
}
s = s[i+1:]
if _, err := w.WriteString(esc); err != nil {
return err
}
i = strings.IndexAny(s, escapedChars)
}
_, err := w.WriteString(s)
return err
}
// EscapeString escapes special characters like "<" to become "&lt;". It
// escapes only five such characters: <, >, &, ' and ".
// UnescapeString(EscapeString(s)) == s always holds, but the converse isn't
// always true.
func EscapeString(s string) string {
if strings.IndexAny(s, escapedChars) == -1 {
return s
}
var buf bytes.Buffer
escape(&buf, s)
return buf.String()
}
// UnescapeString unescapes entities like "&lt;" to become "<". It unescapes a
// larger range of entities than EscapeString escapes. For example, "&aacute;"
// unescapes to "á", as does "&#225;" and "&xE1;".
// UnescapeString(EscapeString(s)) == s always holds, but the converse isn't
// always true.
func UnescapeString(s string) string {
for _, c := range s {
if c == '&' {
return string(unescape([]byte(s), false))
}
}
return s
}

View File

@@ -0,0 +1,97 @@
// Copyright 2013 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package html
import "testing"
type unescapeTest struct {
// A short description of the test case.
desc string
// The HTML text.
html string
// The unescaped text.
unescaped string
}
var unescapeTests = []unescapeTest{
// Handle no entities.
{
"copy",
"A\ttext\nstring",
"A\ttext\nstring",
},
// Handle simple named entities.
{
"simple",
"&amp; &gt; &lt;",
"& > <",
},
// Handle hitting the end of the string.
{
"stringEnd",
"&amp &amp",
"& &",
},
// Handle entities with two codepoints.
{
"multiCodepoint",
"text &gesl; blah",
"text \u22db\ufe00 blah",
},
// Handle decimal numeric entities.
{
"decimalEntity",
"Delta = &#916; ",
"Delta = Δ ",
},
// Handle hexadecimal numeric entities.
{
"hexadecimalEntity",
"Lambda = &#x3bb; = &#X3Bb ",
"Lambda = λ = λ ",
},
// Handle numeric early termination.
{
"numericEnds",
"&# &#x &#128;43 &copy = &#169f = &#xa9",
"&# &#x €43 © = ©f = ©",
},
// Handle numeric ISO-8859-1 entity replacements.
{
"numericReplacements",
"Footnote&#x87;",
"Footnote‡",
},
}
func TestUnescape(t *testing.T) {
for _, tt := range unescapeTests {
unescaped := UnescapeString(tt.html)
if unescaped != tt.unescaped {
t.Errorf("TestUnescape %s: want %q, got %q", tt.desc, tt.unescaped, unescaped)
}
}
}
func TestUnescapeEscape(t *testing.T) {
ss := []string{
``,
`abc def`,
`a & b`,
`a&amp;b`,
`a &amp b`,
`&quot;`,
`"`,
`"<&>"`,
`&quot;&lt;&amp;&gt;&quot;`,
`3&5==1 && 0<1, "0&lt;1", a+acute=&aacute;`,
`The special characters are: <, >, &, ' and "`,
}
for _, s := range ss {
if got := UnescapeString(EscapeString(s)); got != s {
t.Errorf("got %q want %q", got, s)
}
}
}

View File

@@ -0,0 +1,40 @@
// Copyright 2012 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// This example demonstrates parsing HTML data and walking the resulting tree.
package html_test
import (
"fmt"
"log"
"strings"
"code.google.com/p/go.net/html"
)
func ExampleParse() {
s := `<p>Links:</p><ul><li><a href="foo">Foo</a><li><a href="/bar/baz">BarBaz</a></ul>`
doc, err := html.Parse(strings.NewReader(s))
if err != nil {
log.Fatal(err)
}
var f func(*html.Node)
f = func(n *html.Node) {
if n.Type == html.ElementNode && n.Data == "a" {
for _, a := range n.Attr {
if a.Key == "href" {
fmt.Println(a.Val)
break
}
}
}
for c := n.FirstChild; c != nil; c = c.NextSibling {
f(c)
}
}
f(doc)
// Output:
// foo
// /bar/baz
}

View File

@@ -0,0 +1,226 @@
// Copyright 2011 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package html
import (
"strings"
)
func adjustAttributeNames(aa []Attribute, nameMap map[string]string) {
for i := range aa {
if newName, ok := nameMap[aa[i].Key]; ok {
aa[i].Key = newName
}
}
}
func adjustForeignAttributes(aa []Attribute) {
for i, a := range aa {
if a.Key == "" || a.Key[0] != 'x' {
continue
}
switch a.Key {
case "xlink:actuate", "xlink:arcrole", "xlink:href", "xlink:role", "xlink:show",
"xlink:title", "xlink:type", "xml:base", "xml:lang", "xml:space", "xmlns:xlink":
j := strings.Index(a.Key, ":")
aa[i].Namespace = a.Key[:j]
aa[i].Key = a.Key[j+1:]
}
}
}
func htmlIntegrationPoint(n *Node) bool {
if n.Type != ElementNode {
return false
}
switch n.Namespace {
case "math":
if n.Data == "annotation-xml" {
for _, a := range n.Attr {
if a.Key == "encoding" {
val := strings.ToLower(a.Val)
if val == "text/html" || val == "application/xhtml+xml" {
return true
}
}
}
}
case "svg":
switch n.Data {
case "desc", "foreignObject", "title":
return true
}
}
return false
}
func mathMLTextIntegrationPoint(n *Node) bool {
if n.Namespace != "math" {
return false
}
switch n.Data {
case "mi", "mo", "mn", "ms", "mtext":
return true
}
return false
}
// Section 12.2.5.5.
var breakout = map[string]bool{
"b": true,
"big": true,
"blockquote": true,
"body": true,
"br": true,
"center": true,
"code": true,
"dd": true,
"div": true,
"dl": true,
"dt": true,
"em": true,
"embed": true,
"h1": true,
"h2": true,
"h3": true,
"h4": true,
"h5": true,
"h6": true,
"head": true,
"hr": true,
"i": true,
"img": true,
"li": true,
"listing": true,
"menu": true,
"meta": true,
"nobr": true,
"ol": true,
"p": true,
"pre": true,
"ruby": true,
"s": true,
"small": true,
"span": true,
"strong": true,
"strike": true,
"sub": true,
"sup": true,
"table": true,
"tt": true,
"u": true,
"ul": true,
"var": true,
}
// Section 12.2.5.5.
var svgTagNameAdjustments = map[string]string{
"altglyph": "altGlyph",
"altglyphdef": "altGlyphDef",
"altglyphitem": "altGlyphItem",
"animatecolor": "animateColor",
"animatemotion": "animateMotion",
"animatetransform": "animateTransform",
"clippath": "clipPath",
"feblend": "feBlend",
"fecolormatrix": "feColorMatrix",
"fecomponenttransfer": "feComponentTransfer",
"fecomposite": "feComposite",
"feconvolvematrix": "feConvolveMatrix",
"fediffuselighting": "feDiffuseLighting",
"fedisplacementmap": "feDisplacementMap",
"fedistantlight": "feDistantLight",
"feflood": "feFlood",
"fefunca": "feFuncA",
"fefuncb": "feFuncB",
"fefuncg": "feFuncG",
"fefuncr": "feFuncR",
"fegaussianblur": "feGaussianBlur",
"feimage": "feImage",
"femerge": "feMerge",
"femergenode": "feMergeNode",
"femorphology": "feMorphology",
"feoffset": "feOffset",
"fepointlight": "fePointLight",
"fespecularlighting": "feSpecularLighting",
"fespotlight": "feSpotLight",
"fetile": "feTile",
"feturbulence": "feTurbulence",
"foreignobject": "foreignObject",
"glyphref": "glyphRef",
"lineargradient": "linearGradient",
"radialgradient": "radialGradient",
"textpath": "textPath",
}
// Section 12.2.5.1
var mathMLAttributeAdjustments = map[string]string{
"definitionurl": "definitionURL",
}
var svgAttributeAdjustments = map[string]string{
"attributename": "attributeName",
"attributetype": "attributeType",
"basefrequency": "baseFrequency",
"baseprofile": "baseProfile",
"calcmode": "calcMode",
"clippathunits": "clipPathUnits",
"contentscripttype": "contentScriptType",
"contentstyletype": "contentStyleType",
"diffuseconstant": "diffuseConstant",
"edgemode": "edgeMode",
"externalresourcesrequired": "externalResourcesRequired",
"filterres": "filterRes",
"filterunits": "filterUnits",
"glyphref": "glyphRef",
"gradienttransform": "gradientTransform",
"gradientunits": "gradientUnits",
"kernelmatrix": "kernelMatrix",
"kernelunitlength": "kernelUnitLength",
"keypoints": "keyPoints",
"keysplines": "keySplines",
"keytimes": "keyTimes",
"lengthadjust": "lengthAdjust",
"limitingconeangle": "limitingConeAngle",
"markerheight": "markerHeight",
"markerunits": "markerUnits",
"markerwidth": "markerWidth",
"maskcontentunits": "maskContentUnits",
"maskunits": "maskUnits",
"numoctaves": "numOctaves",
"pathlength": "pathLength",
"patterncontentunits": "patternContentUnits",
"patterntransform": "patternTransform",
"patternunits": "patternUnits",
"pointsatx": "pointsAtX",
"pointsaty": "pointsAtY",
"pointsatz": "pointsAtZ",
"preservealpha": "preserveAlpha",
"preserveaspectratio": "preserveAspectRatio",
"primitiveunits": "primitiveUnits",
"refx": "refX",
"refy": "refY",
"repeatcount": "repeatCount",
"repeatdur": "repeatDur",
"requiredextensions": "requiredExtensions",
"requiredfeatures": "requiredFeatures",
"specularconstant": "specularConstant",
"specularexponent": "specularExponent",
"spreadmethod": "spreadMethod",
"startoffset": "startOffset",
"stddeviation": "stdDeviation",
"stitchtiles": "stitchTiles",
"surfacescale": "surfaceScale",
"systemlanguage": "systemLanguage",
"tablevalues": "tableValues",
"targetx": "targetX",
"targety": "targetY",
"textlength": "textLength",
"viewbox": "viewBox",
"viewtarget": "viewTarget",
"xchannelselector": "xChannelSelector",
"ychannelselector": "yChannelSelector",
"zoomandpan": "zoomAndPan",
}

View File

@@ -0,0 +1,193 @@
// Copyright 2011 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package html
import (
"code.google.com/p/go.net/html/atom"
)
// A NodeType is the type of a Node.
type NodeType uint32
const (
ErrorNode NodeType = iota
TextNode
DocumentNode
ElementNode
CommentNode
DoctypeNode
scopeMarkerNode
)
// Section 12.2.3.3 says "scope markers are inserted when entering applet
// elements, buttons, object elements, marquees, table cells, and table
// captions, and are used to prevent formatting from 'leaking'".
var scopeMarker = Node{Type: scopeMarkerNode}
// A Node consists of a NodeType and some Data (tag name for element nodes,
// content for text) and are part of a tree of Nodes. Element nodes may also
// have a Namespace and contain a slice of Attributes. Data is unescaped, so
// that it looks like "a<b" rather than "a&lt;b". For element nodes, DataAtom
// is the atom for Data, or zero if Data is not a known tag name.
//
// An empty Namespace implies a "http://www.w3.org/1999/xhtml" namespace.
// Similarly, "math" is short for "http://www.w3.org/1998/Math/MathML", and
// "svg" is short for "http://www.w3.org/2000/svg".
type Node struct {
Parent, FirstChild, LastChild, PrevSibling, NextSibling *Node
Type NodeType
DataAtom atom.Atom
Data string
Namespace string
Attr []Attribute
}
// InsertBefore inserts newChild as a child of n, immediately before oldChild
// in the sequence of n's children. oldChild may be nil, in which case newChild
// is appended to the end of n's children.
//
// It will panic if newChild already has a parent or siblings.
func (n *Node) InsertBefore(newChild, oldChild *Node) {
if newChild.Parent != nil || newChild.PrevSibling != nil || newChild.NextSibling != nil {
panic("html: InsertBefore called for an attached child Node")
}
var prev, next *Node
if oldChild != nil {
prev, next = oldChild.PrevSibling, oldChild
} else {
prev = n.LastChild
}
if prev != nil {
prev.NextSibling = newChild
} else {
n.FirstChild = newChild
}
if next != nil {
next.PrevSibling = newChild
} else {
n.LastChild = newChild
}
newChild.Parent = n
newChild.PrevSibling = prev
newChild.NextSibling = next
}
// AppendChild adds a node c as a child of n.
//
// It will panic if c already has a parent or siblings.
func (n *Node) AppendChild(c *Node) {
if c.Parent != nil || c.PrevSibling != nil || c.NextSibling != nil {
panic("html: AppendChild called for an attached child Node")
}
last := n.LastChild
if last != nil {
last.NextSibling = c
} else {
n.FirstChild = c
}
n.LastChild = c
c.Parent = n
c.PrevSibling = last
}
// RemoveChild removes a node c that is a child of n. Afterwards, c will have
// no parent and no siblings.
//
// It will panic if c's parent is not n.
func (n *Node) RemoveChild(c *Node) {
if c.Parent != n {
panic("html: RemoveChild called for a non-child Node")
}
if n.FirstChild == c {
n.FirstChild = c.NextSibling
}
if c.NextSibling != nil {
c.NextSibling.PrevSibling = c.PrevSibling
}
if n.LastChild == c {
n.LastChild = c.PrevSibling
}
if c.PrevSibling != nil {
c.PrevSibling.NextSibling = c.NextSibling
}
c.Parent = nil
c.PrevSibling = nil
c.NextSibling = nil
}
// reparentChildren reparents all of src's child nodes to dst.
func reparentChildren(dst, src *Node) {
for {
child := src.FirstChild
if child == nil {
break
}
src.RemoveChild(child)
dst.AppendChild(child)
}
}
// clone returns a new node with the same type, data and attributes.
// The clone has no parent, no siblings and no children.
func (n *Node) clone() *Node {
m := &Node{
Type: n.Type,
DataAtom: n.DataAtom,
Data: n.Data,
Attr: make([]Attribute, len(n.Attr)),
}
copy(m.Attr, n.Attr)
return m
}
// nodeStack is a stack of nodes.
type nodeStack []*Node
// pop pops the stack. It will panic if s is empty.
func (s *nodeStack) pop() *Node {
i := len(*s)
n := (*s)[i-1]
*s = (*s)[:i-1]
return n
}
// top returns the most recently pushed node, or nil if s is empty.
func (s *nodeStack) top() *Node {
if i := len(*s); i > 0 {
return (*s)[i-1]
}
return nil
}
// index returns the index of the top-most occurrence of n in the stack, or -1
// if n is not present.
func (s *nodeStack) index(n *Node) int {
for i := len(*s) - 1; i >= 0; i-- {
if (*s)[i] == n {
return i
}
}
return -1
}
// insert inserts a node at the given index.
func (s *nodeStack) insert(i int, n *Node) {
(*s) = append(*s, nil)
copy((*s)[i+1:], (*s)[i:])
(*s)[i] = n
}
// remove removes a node from the stack. It is a no-op if n is not present.
func (s *nodeStack) remove(n *Node) {
i := s.index(n)
if i == -1 {
return
}
copy((*s)[i:], (*s)[i+1:])
j := len(*s) - 1
(*s)[j] = nil
*s = (*s)[:j]
}

View File

@@ -0,0 +1,146 @@
// Copyright 2010 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package html
import (
"fmt"
)
// checkTreeConsistency checks that a node and its descendants are all
// consistent in their parent/child/sibling relationships.
func checkTreeConsistency(n *Node) error {
return checkTreeConsistency1(n, 0)
}
func checkTreeConsistency1(n *Node, depth int) error {
if depth == 1e4 {
return fmt.Errorf("html: tree looks like it contains a cycle")
}
if err := checkNodeConsistency(n); err != nil {
return err
}
for c := n.FirstChild; c != nil; c = c.NextSibling {
if err := checkTreeConsistency1(c, depth+1); err != nil {
return err
}
}
return nil
}
// checkNodeConsistency checks that a node's parent/child/sibling relationships
// are consistent.
func checkNodeConsistency(n *Node) error {
if n == nil {
return nil
}
nParent := 0
for p := n.Parent; p != nil; p = p.Parent {
nParent++
if nParent == 1e4 {
return fmt.Errorf("html: parent list looks like an infinite loop")
}
}
nForward := 0
for c := n.FirstChild; c != nil; c = c.NextSibling {
nForward++
if nForward == 1e6 {
return fmt.Errorf("html: forward list of children looks like an infinite loop")
}
if c.Parent != n {
return fmt.Errorf("html: inconsistent child/parent relationship")
}
}
nBackward := 0
for c := n.LastChild; c != nil; c = c.PrevSibling {
nBackward++
if nBackward == 1e6 {
return fmt.Errorf("html: backward list of children looks like an infinite loop")
}
if c.Parent != n {
return fmt.Errorf("html: inconsistent child/parent relationship")
}
}
if n.Parent != nil {
if n.Parent == n {
return fmt.Errorf("html: inconsistent parent relationship")
}
if n.Parent == n.FirstChild {
return fmt.Errorf("html: inconsistent parent/first relationship")
}
if n.Parent == n.LastChild {
return fmt.Errorf("html: inconsistent parent/last relationship")
}
if n.Parent == n.PrevSibling {
return fmt.Errorf("html: inconsistent parent/prev relationship")
}
if n.Parent == n.NextSibling {
return fmt.Errorf("html: inconsistent parent/next relationship")
}
parentHasNAsAChild := false
for c := n.Parent.FirstChild; c != nil; c = c.NextSibling {
if c == n {
parentHasNAsAChild = true
break
}
}
if !parentHasNAsAChild {
return fmt.Errorf("html: inconsistent parent/child relationship")
}
}
if n.PrevSibling != nil && n.PrevSibling.NextSibling != n {
return fmt.Errorf("html: inconsistent prev/next relationship")
}
if n.NextSibling != nil && n.NextSibling.PrevSibling != n {
return fmt.Errorf("html: inconsistent next/prev relationship")
}
if (n.FirstChild == nil) != (n.LastChild == nil) {
return fmt.Errorf("html: inconsistent first/last relationship")
}
if n.FirstChild != nil && n.FirstChild == n.LastChild {
// We have a sole child.
if n.FirstChild.PrevSibling != nil || n.FirstChild.NextSibling != nil {
return fmt.Errorf("html: inconsistent sole child's sibling relationship")
}
}
seen := map[*Node]bool{}
var last *Node
for c := n.FirstChild; c != nil; c = c.NextSibling {
if seen[c] {
return fmt.Errorf("html: inconsistent repeated child")
}
seen[c] = true
last = c
}
if last != n.LastChild {
return fmt.Errorf("html: inconsistent last relationship")
}
var first *Node
for c := n.LastChild; c != nil; c = c.PrevSibling {
if !seen[c] {
return fmt.Errorf("html: inconsistent missing child")
}
delete(seen, c)
first = c
}
if first != n.FirstChild {
return fmt.Errorf("html: inconsistent first relationship")
}
if len(seen) != 0 {
return fmt.Errorf("html: inconsistent forwards/backwards child list")
}
return nil
}

View File

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,388 @@
// Copyright 2010 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package html
import (
"bufio"
"bytes"
"errors"
"fmt"
"io"
"io/ioutil"
"os"
"path/filepath"
"runtime"
"sort"
"strings"
"testing"
"code.google.com/p/go.net/html/atom"
)
// readParseTest reads a single test case from r.
func readParseTest(r *bufio.Reader) (text, want, context string, err error) {
line, err := r.ReadSlice('\n')
if err != nil {
return "", "", "", err
}
var b []byte
// Read the HTML.
if string(line) != "#data\n" {
return "", "", "", fmt.Errorf(`got %q want "#data\n"`, line)
}
for {
line, err = r.ReadSlice('\n')
if err != nil {
return "", "", "", err
}
if line[0] == '#' {
break
}
b = append(b, line...)
}
text = strings.TrimSuffix(string(b), "\n")
b = b[:0]
// Skip the error list.
if string(line) != "#errors\n" {
return "", "", "", fmt.Errorf(`got %q want "#errors\n"`, line)
}
for {
line, err = r.ReadSlice('\n')
if err != nil {
return "", "", "", err
}
if line[0] == '#' {
break
}
}
if string(line) == "#document-fragment\n" {
line, err = r.ReadSlice('\n')
if err != nil {
return "", "", "", err
}
context = strings.TrimSpace(string(line))
line, err = r.ReadSlice('\n')
if err != nil {
return "", "", "", err
}
}
// Read the dump of what the parse tree should be.
if string(line) != "#document\n" {
return "", "", "", fmt.Errorf(`got %q want "#document\n"`, line)
}
inQuote := false
for {
line, err = r.ReadSlice('\n')
if err != nil && err != io.EOF {
return "", "", "", err
}
trimmed := bytes.Trim(line, "| \n")
if len(trimmed) > 0 {
if line[0] == '|' && trimmed[0] == '"' {
inQuote = true
}
if trimmed[len(trimmed)-1] == '"' && !(line[0] == '|' && len(trimmed) == 1) {
inQuote = false
}
}
if len(line) == 0 || len(line) == 1 && line[0] == '\n' && !inQuote {
break
}
b = append(b, line...)
}
return text, string(b), context, nil
}
func dumpIndent(w io.Writer, level int) {
io.WriteString(w, "| ")
for i := 0; i < level; i++ {
io.WriteString(w, " ")
}
}
type sortedAttributes []Attribute
func (a sortedAttributes) Len() int {
return len(a)
}
func (a sortedAttributes) Less(i, j int) bool {
if a[i].Namespace != a[j].Namespace {
return a[i].Namespace < a[j].Namespace
}
return a[i].Key < a[j].Key
}
func (a sortedAttributes) Swap(i, j int) {
a[i], a[j] = a[j], a[i]
}
func dumpLevel(w io.Writer, n *Node, level int) error {
dumpIndent(w, level)
switch n.Type {
case ErrorNode:
return errors.New("unexpected ErrorNode")
case DocumentNode:
return errors.New("unexpected DocumentNode")
case ElementNode:
if n.Namespace != "" {
fmt.Fprintf(w, "<%s %s>", n.Namespace, n.Data)
} else {
fmt.Fprintf(w, "<%s>", n.Data)
}
attr := sortedAttributes(n.Attr)
sort.Sort(attr)
for _, a := range attr {
io.WriteString(w, "\n")
dumpIndent(w, level+1)
if a.Namespace != "" {
fmt.Fprintf(w, `%s %s="%s"`, a.Namespace, a.Key, a.Val)
} else {
fmt.Fprintf(w, `%s="%s"`, a.Key, a.Val)
}
}
case TextNode:
fmt.Fprintf(w, `"%s"`, n.Data)
case CommentNode:
fmt.Fprintf(w, "<!-- %s -->", n.Data)
case DoctypeNode:
fmt.Fprintf(w, "<!DOCTYPE %s", n.Data)
if n.Attr != nil {
var p, s string
for _, a := range n.Attr {
switch a.Key {
case "public":
p = a.Val
case "system":
s = a.Val
}
}
if p != "" || s != "" {
fmt.Fprintf(w, ` "%s"`, p)
fmt.Fprintf(w, ` "%s"`, s)
}
}
io.WriteString(w, ">")
case scopeMarkerNode:
return errors.New("unexpected scopeMarkerNode")
default:
return errors.New("unknown node type")
}
io.WriteString(w, "\n")
for c := n.FirstChild; c != nil; c = c.NextSibling {
if err := dumpLevel(w, c, level+1); err != nil {
return err
}
}
return nil
}
func dump(n *Node) (string, error) {
if n == nil || n.FirstChild == nil {
return "", nil
}
var b bytes.Buffer
for c := n.FirstChild; c != nil; c = c.NextSibling {
if err := dumpLevel(&b, c, 0); err != nil {
return "", err
}
}
return b.String(), nil
}
const testDataDir = "testdata/webkit/"
func TestParser(t *testing.T) {
testFiles, err := filepath.Glob(testDataDir + "*.dat")
if err != nil {
t.Fatal(err)
}
for _, tf := range testFiles {
f, err := os.Open(tf)
if err != nil {
t.Fatal(err)
}
defer f.Close()
r := bufio.NewReader(f)
for i := 0; ; i++ {
text, want, context, err := readParseTest(r)
if err == io.EOF {
break
}
if err != nil {
t.Fatal(err)
}
err = testParseCase(text, want, context)
if err != nil {
t.Errorf("%s test #%d %q, %s", tf, i, text, err)
}
}
}
}
// testParseCase tests one test case from the test files. If the test does not
// pass, it returns an error that explains the failure.
// text is the HTML to be parsed, want is a dump of the correct parse tree,
// and context is the name of the context node, if any.
func testParseCase(text, want, context string) (err error) {
defer func() {
if x := recover(); x != nil {
switch e := x.(type) {
case error:
err = e
default:
err = fmt.Errorf("%v", e)
}
}
}()
var doc *Node
if context == "" {
doc, err = Parse(strings.NewReader(text))
if err != nil {
return err
}
} else {
contextNode := &Node{
Type: ElementNode,
DataAtom: atom.Lookup([]byte(context)),
Data: context,
}
nodes, err := ParseFragment(strings.NewReader(text), contextNode)
if err != nil {
return err
}
doc = &Node{
Type: DocumentNode,
}
for _, n := range nodes {
doc.AppendChild(n)
}
}
if err := checkTreeConsistency(doc); err != nil {
return err
}
got, err := dump(doc)
if err != nil {
return err
}
// Compare the parsed tree to the #document section.
if got != want {
return fmt.Errorf("got vs want:\n----\n%s----\n%s----", got, want)
}
if renderTestBlacklist[text] || context != "" {
return nil
}
// Check that rendering and re-parsing results in an identical tree.
pr, pw := io.Pipe()
go func() {
pw.CloseWithError(Render(pw, doc))
}()
doc1, err := Parse(pr)
if err != nil {
return err
}
got1, err := dump(doc1)
if err != nil {
return err
}
if got != got1 {
return fmt.Errorf("got vs got1:\n----\n%s----\n%s----", got, got1)
}
return nil
}
// Some test input result in parse trees are not 'well-formed' despite
// following the HTML5 recovery algorithms. Rendering and re-parsing such a
// tree will not result in an exact clone of that tree. We blacklist such
// inputs from the render test.
var renderTestBlacklist = map[string]bool{
// The second <a> will be reparented to the first <table>'s parent. This
// results in an <a> whose parent is an <a>, which is not 'well-formed'.
`<a><table><td><a><table></table><a></tr><a></table><b>X</b>C<a>Y`: true,
// The same thing with a <p>:
`<p><table></p>`: true,
// More cases of <a> being reparented:
`<a href="blah">aba<table><a href="foo">br<tr><td></td></tr>x</table>aoe`: true,
`<a><table><a></table><p><a><div><a>`: true,
`<a><table><td><a><table></table><a></tr><a></table><a>`: true,
// A similar reparenting situation involving <nobr>:
`<!DOCTYPE html><body><b><nobr>1<table><nobr></b><i><nobr>2<nobr></i>3`: true,
// A <plaintext> element is reparented, putting it before a table.
// A <plaintext> element can't have anything after it in HTML.
`<table><plaintext><td>`: true,
`<!doctype html><table><plaintext></plaintext>`: true,
`<!doctype html><table><tbody><plaintext></plaintext>`: true,
`<!doctype html><table><tbody><tr><plaintext></plaintext>`: true,
// A form inside a table inside a form doesn't work either.
`<!doctype html><form><table></form><form></table></form>`: true,
// A script that ends at EOF may escape its own closing tag when rendered.
`<!doctype html><script><!--<script `: true,
`<!doctype html><script><!--<script <`: true,
`<!doctype html><script><!--<script <a`: true,
`<!doctype html><script><!--<script </`: true,
`<!doctype html><script><!--<script </s`: true,
`<!doctype html><script><!--<script </script`: true,
`<!doctype html><script><!--<script </scripta`: true,
`<!doctype html><script><!--<script -`: true,
`<!doctype html><script><!--<script -a`: true,
`<!doctype html><script><!--<script -<`: true,
`<!doctype html><script><!--<script --`: true,
`<!doctype html><script><!--<script --a`: true,
`<!doctype html><script><!--<script --<`: true,
`<script><!--<script `: true,
`<script><!--<script <a`: true,
`<script><!--<script </script`: true,
`<script><!--<script </scripta`: true,
`<script><!--<script -`: true,
`<script><!--<script -a`: true,
`<script><!--<script --`: true,
`<script><!--<script --a`: true,
`<script><!--<script <`: true,
`<script><!--<script </`: true,
`<script><!--<script </s`: true,
// Reconstructing the active formatting elements results in a <plaintext>
// element that contains an <a> element.
`<!doctype html><p><a><plaintext>b`: true,
}
func TestNodeConsistency(t *testing.T) {
// inconsistentNode is a Node whose DataAtom and Data do not agree.
inconsistentNode := &Node{
Type: ElementNode,
DataAtom: atom.Frameset,
Data: "table",
}
_, err := ParseFragment(strings.NewReader("<p>hello</p>"), inconsistentNode)
if err == nil {
t.Errorf("got nil error, want non-nil")
}
}
func BenchmarkParser(b *testing.B) {
buf, err := ioutil.ReadFile("testdata/go1.html")
if err != nil {
b.Fatalf("could not read testdata/go1.html: %v", err)
}
b.SetBytes(int64(len(buf)))
runtime.GC()
b.ReportAllocs()
b.ResetTimer()
for i := 0; i < b.N; i++ {
Parse(bytes.NewBuffer(buf))
}
}

View File

@@ -0,0 +1,271 @@
// Copyright 2011 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package html
import (
"bufio"
"errors"
"fmt"
"io"
"strings"
)
type writer interface {
io.Writer
WriteByte(c byte) error // in Go 1.1, use io.ByteWriter
WriteString(string) (int, error)
}
// Render renders the parse tree n to the given writer.
//
// Rendering is done on a 'best effort' basis: calling Parse on the output of
// Render will always result in something similar to the original tree, but it
// is not necessarily an exact clone unless the original tree was 'well-formed'.
// 'Well-formed' is not easily specified; the HTML5 specification is
// complicated.
//
// Calling Parse on arbitrary input typically results in a 'well-formed' parse
// tree. However, it is possible for Parse to yield a 'badly-formed' parse tree.
// For example, in a 'well-formed' parse tree, no <a> element is a child of
// another <a> element: parsing "<a><a>" results in two sibling elements.
// Similarly, in a 'well-formed' parse tree, no <a> element is a child of a
// <table> element: parsing "<p><table><a>" results in a <p> with two sibling
// children; the <a> is reparented to the <table>'s parent. However, calling
// Parse on "<a><table><a>" does not return an error, but the result has an <a>
// element with an <a> child, and is therefore not 'well-formed'.
//
// Programmatically constructed trees are typically also 'well-formed', but it
// is possible to construct a tree that looks innocuous but, when rendered and
// re-parsed, results in a different tree. A simple example is that a solitary
// text node would become a tree containing <html>, <head> and <body> elements.
// Another example is that the programmatic equivalent of "a<head>b</head>c"
// becomes "<html><head><head/><body>abc</body></html>".
func Render(w io.Writer, n *Node) error {
if x, ok := w.(writer); ok {
return render(x, n)
}
buf := bufio.NewWriter(w)
if err := render(buf, n); err != nil {
return err
}
return buf.Flush()
}
// plaintextAbort is returned from render1 when a <plaintext> element
// has been rendered. No more end tags should be rendered after that.
var plaintextAbort = errors.New("html: internal error (plaintext abort)")
func render(w writer, n *Node) error {
err := render1(w, n)
if err == plaintextAbort {
err = nil
}
return err
}
func render1(w writer, n *Node) error {
// Render non-element nodes; these are the easy cases.
switch n.Type {
case ErrorNode:
return errors.New("html: cannot render an ErrorNode node")
case TextNode:
return escape(w, n.Data)
case DocumentNode:
for c := n.FirstChild; c != nil; c = c.NextSibling {
if err := render1(w, c); err != nil {
return err
}
}
return nil
case ElementNode:
// No-op.
case CommentNode:
if _, err := w.WriteString("<!--"); err != nil {
return err
}
if _, err := w.WriteString(n.Data); err != nil {
return err
}
if _, err := w.WriteString("-->"); err != nil {
return err
}
return nil
case DoctypeNode:
if _, err := w.WriteString("<!DOCTYPE "); err != nil {
return err
}
if _, err := w.WriteString(n.Data); err != nil {
return err
}
if n.Attr != nil {
var p, s string
for _, a := range n.Attr {
switch a.Key {
case "public":
p = a.Val
case "system":
s = a.Val
}
}
if p != "" {
if _, err := w.WriteString(" PUBLIC "); err != nil {
return err
}
if err := writeQuoted(w, p); err != nil {
return err
}
if s != "" {
if err := w.WriteByte(' '); err != nil {
return err
}
if err := writeQuoted(w, s); err != nil {
return err
}
}
} else if s != "" {
if _, err := w.WriteString(" SYSTEM "); err != nil {
return err
}
if err := writeQuoted(w, s); err != nil {
return err
}
}
}
return w.WriteByte('>')
default:
return errors.New("html: unknown node type")
}
// Render the <xxx> opening tag.
if err := w.WriteByte('<'); err != nil {
return err
}
if _, err := w.WriteString(n.Data); err != nil {
return err
}
for _, a := range n.Attr {
if err := w.WriteByte(' '); err != nil {
return err
}
if a.Namespace != "" {
if _, err := w.WriteString(a.Namespace); err != nil {
return err
}
if err := w.WriteByte(':'); err != nil {
return err
}
}
if _, err := w.WriteString(a.Key); err != nil {
return err
}
if _, err := w.WriteString(`="`); err != nil {
return err
}
if err := escape(w, a.Val); err != nil {
return err
}
if err := w.WriteByte('"'); err != nil {
return err
}
}
if voidElements[n.Data] {
if n.FirstChild != nil {
return fmt.Errorf("html: void element <%s> has child nodes", n.Data)
}
_, err := w.WriteString("/>")
return err
}
if err := w.WriteByte('>'); err != nil {
return err
}
// Add initial newline where there is danger of a newline beging ignored.
if c := n.FirstChild; c != nil && c.Type == TextNode && strings.HasPrefix(c.Data, "\n") {
switch n.Data {
case "pre", "listing", "textarea":
if err := w.WriteByte('\n'); err != nil {
return err
}
}
}
// Render any child nodes.
switch n.Data {
case "iframe", "noembed", "noframes", "noscript", "plaintext", "script", "style", "xmp":
for c := n.FirstChild; c != nil; c = c.NextSibling {
if c.Type == TextNode {
if _, err := w.WriteString(c.Data); err != nil {
return err
}
} else {
if err := render1(w, c); err != nil {
return err
}
}
}
if n.Data == "plaintext" {
// Don't render anything else. <plaintext> must be the
// last element in the file, with no closing tag.
return plaintextAbort
}
default:
for c := n.FirstChild; c != nil; c = c.NextSibling {
if err := render1(w, c); err != nil {
return err
}
}
}
// Render the </xxx> closing tag.
if _, err := w.WriteString("</"); err != nil {
return err
}
if _, err := w.WriteString(n.Data); err != nil {
return err
}
return w.WriteByte('>')
}
// writeQuoted writes s to w surrounded by quotes. Normally it will use double
// quotes, but if s contains a double quote, it will use single quotes.
// It is used for writing the identifiers in a doctype declaration.
// In valid HTML, they can't contain both types of quotes.
func writeQuoted(w writer, s string) error {
var q byte = '"'
if strings.Contains(s, `"`) {
q = '\''
}
if err := w.WriteByte(q); err != nil {
return err
}
if _, err := w.WriteString(s); err != nil {
return err
}
if err := w.WriteByte(q); err != nil {
return err
}
return nil
}
// Section 12.1.2, "Elements", gives this list of void elements. Void elements
// are those that can't have any contents.
var voidElements = map[string]bool{
"area": true,
"base": true,
"br": true,
"col": true,
"command": true,
"embed": true,
"hr": true,
"img": true,
"input": true,
"keygen": true,
"link": true,
"meta": true,
"param": true,
"source": true,
"track": true,
"wbr": true,
}

View File

@@ -0,0 +1,156 @@
// Copyright 2010 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package html
import (
"bytes"
"testing"
)
func TestRenderer(t *testing.T) {
nodes := [...]*Node{
0: {
Type: ElementNode,
Data: "html",
},
1: {
Type: ElementNode,
Data: "head",
},
2: {
Type: ElementNode,
Data: "body",
},
3: {
Type: TextNode,
Data: "0<1",
},
4: {
Type: ElementNode,
Data: "p",
Attr: []Attribute{
{
Key: "id",
Val: "A",
},
{
Key: "foo",
Val: `abc"def`,
},
},
},
5: {
Type: TextNode,
Data: "2",
},
6: {
Type: ElementNode,
Data: "b",
Attr: []Attribute{
{
Key: "empty",
Val: "",
},
},
},
7: {
Type: TextNode,
Data: "3",
},
8: {
Type: ElementNode,
Data: "i",
Attr: []Attribute{
{
Key: "backslash",
Val: `\`,
},
},
},
9: {
Type: TextNode,
Data: "&4",
},
10: {
Type: TextNode,
Data: "5",
},
11: {
Type: ElementNode,
Data: "blockquote",
},
12: {
Type: ElementNode,
Data: "br",
},
13: {
Type: TextNode,
Data: "6",
},
}
// Build a tree out of those nodes, based on a textual representation.
// Only the ".\t"s are significant. The trailing HTML-like text is
// just commentary. The "0:" prefixes are for easy cross-reference with
// the nodes array.
treeAsText := [...]string{
0: `<html>`,
1: `. <head>`,
2: `. <body>`,
3: `. . "0&lt;1"`,
4: `. . <p id="A" foo="abc&#34;def">`,
5: `. . . "2"`,
6: `. . . <b empty="">`,
7: `. . . . "3"`,
8: `. . . <i backslash="\">`,
9: `. . . . "&amp;4"`,
10: `. . "5"`,
11: `. . <blockquote>`,
12: `. . <br>`,
13: `. . "6"`,
}
if len(nodes) != len(treeAsText) {
t.Fatal("len(nodes) != len(treeAsText)")
}
var stack [8]*Node
for i, line := range treeAsText {
level := 0
for line[0] == '.' {
// Strip a leading ".\t".
line = line[2:]
level++
}
n := nodes[i]
if level == 0 {
if stack[0] != nil {
t.Fatal("multiple root nodes")
}
stack[0] = n
} else {
stack[level-1].AppendChild(n)
stack[level] = n
for i := level + 1; i < len(stack); i++ {
stack[i] = nil
}
}
// At each stage of tree construction, we check all nodes for consistency.
for j, m := range nodes {
if err := checkNodeConsistency(m); err != nil {
t.Fatalf("i=%d, j=%d: %v", i, j, err)
}
}
}
want := `<html><head></head><body>0&lt;1<p id="A" foo="abc&#34;def">` +
`2<b empty="">3</b><i backslash="\">&amp;4</i></p>` +
`5<blockquote></blockquote><br/>6</body></html>`
b := new(bytes.Buffer)
if err := Render(b, nodes[0]); err != nil {
t.Fatal(err)
}
if got := b.String(); got != want {
t.Errorf("got vs want:\n%s\n%s\n", got, want)
}
}

View File

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,28 @@
The *.dat files in this directory are copied from The WebKit Open Source
Project, specifically $WEBKITROOT/LayoutTests/html5lib/resources.
WebKit is licensed under a BSD style license.
http://webkit.org/coding/bsd-license.html says:
Copyright (C) 2009 Apple Inc. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY APPLE INC. AND ITS CONTRIBUTORS "AS IS" AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR ITS CONTRIBUTORS BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

View File

@@ -0,0 +1,194 @@
#data
<a><p></a></p>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| <p>
| <a>
#data
<a>1<p>2</a>3</p>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| "1"
| <p>
| <a>
| "2"
| "3"
#data
<a>1<button>2</a>3</button>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| "1"
| <button>
| <a>
| "2"
| "3"
#data
<a>1<b>2</a>3</b>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| "1"
| <b>
| "2"
| <b>
| "3"
#data
<a>1<div>2<div>3</a>4</div>5</div>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| "1"
| <div>
| <a>
| "2"
| <div>
| <a>
| "3"
| "4"
| "5"
#data
<table><a>1<p>2</a>3</p>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| "1"
| <p>
| <a>
| "2"
| "3"
| <table>
#data
<b><b><a><p></a>
#errors
#document
| <html>
| <head>
| <body>
| <b>
| <b>
| <a>
| <p>
| <a>
#data
<b><a><b><p></a>
#errors
#document
| <html>
| <head>
| <body>
| <b>
| <a>
| <b>
| <b>
| <p>
| <a>
#data
<a><b><b><p></a>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| <b>
| <b>
| <b>
| <b>
| <p>
| <a>
#data
<p>1<s id="A">2<b id="B">3</p>4</s>5</b>
#errors
#document
| <html>
| <head>
| <body>
| <p>
| "1"
| <s>
| id="A"
| "2"
| <b>
| id="B"
| "3"
| <s>
| id="A"
| <b>
| id="B"
| "4"
| <b>
| id="B"
| "5"
#data
<table><a>1<td>2</td>3</table>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| "1"
| <a>
| "3"
| <table>
| <tbody>
| <tr>
| <td>
| "2"
#data
<table>A<td>B</td>C</table>
#errors
#document
| <html>
| <head>
| <body>
| "AC"
| <table>
| <tbody>
| <tr>
| <td>
| "B"
#data
<a><svg><tr><input></a>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| <svg svg>
| <svg tr>
| <svg input>

View File

@@ -0,0 +1,31 @@
#data
<b>1<i>2<p>3</b>4
#errors
#document
| <html>
| <head>
| <body>
| <b>
| "1"
| <i>
| "2"
| <i>
| <p>
| <b>
| "3"
| "4"
#data
<a><div><style></style><address><a>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| <div>
| <a>
| <style>
| <address>
| <a>
| <a>

View File

@@ -0,0 +1,135 @@
#data
FOO<!-- BAR -->BAZ
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <!-- BAR -->
| "BAZ"
#data
FOO<!-- BAR --!>BAZ
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <!-- BAR -->
| "BAZ"
#data
FOO<!-- BAR -- >BAZ
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <!-- BAR -- >BAZ -->
#data
FOO<!-- BAR -- <QUX> -- MUX -->BAZ
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <!-- BAR -- <QUX> -- MUX -->
| "BAZ"
#data
FOO<!-- BAR -- <QUX> -- MUX --!>BAZ
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <!-- BAR -- <QUX> -- MUX -->
| "BAZ"
#data
FOO<!-- BAR -- <QUX> -- MUX -- >BAZ
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <!-- BAR -- <QUX> -- MUX -- >BAZ -->
#data
FOO<!---->BAZ
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <!-- -->
| "BAZ"
#data
FOO<!--->BAZ
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <!-- -->
| "BAZ"
#data
FOO<!-->BAZ
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <!-- -->
| "BAZ"
#data
<?xml version="1.0">Hi
#errors
#document
| <!-- ?xml version="1.0" -->
| <html>
| <head>
| <body>
| "Hi"
#data
<?xml version="1.0">
#errors
#document
| <!-- ?xml version="1.0" -->
| <html>
| <head>
| <body>
#data
<?xml version
#errors
#document
| <!-- ?xml version -->
| <html>
| <head>
| <body>
#data
FOO<!----->BAZ
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <!-- - -->
| "BAZ"

View File

@@ -0,0 +1,370 @@
#data
<!DOCTYPE html>Hello
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "Hello"
#data
<!dOctYpE HtMl>Hello
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPEhtml>Hello
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE>Hello
#errors
#document
| <!DOCTYPE >
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE >Hello
#errors
#document
| <!DOCTYPE >
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato>Hello
#errors
#document
| <!DOCTYPE potato>
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato >Hello
#errors
#document
| <!DOCTYPE potato>
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato taco>Hello
#errors
#document
| <!DOCTYPE potato>
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato taco "ddd>Hello
#errors
#document
| <!DOCTYPE potato>
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato sYstEM>Hello
#errors
#document
| <!DOCTYPE potato>
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato sYstEM >Hello
#errors
#document
| <!DOCTYPE potato>
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato sYstEM ggg>Hello
#errors
#document
| <!DOCTYPE potato>
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato SYSTEM taco >Hello
#errors
#document
| <!DOCTYPE potato>
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato SYSTEM 'taco"'>Hello
#errors
#document
| <!DOCTYPE potato "" "taco"">
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato SYSTEM "taco">Hello
#errors
#document
| <!DOCTYPE potato "" "taco">
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato SYSTEM "tai'co">Hello
#errors
#document
| <!DOCTYPE potato "" "tai'co">
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato SYSTEMtaco "ddd">Hello
#errors
#document
| <!DOCTYPE potato>
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato grass SYSTEM taco>Hello
#errors
#document
| <!DOCTYPE potato>
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato pUbLIc>Hello
#errors
#document
| <!DOCTYPE potato>
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato pUbLIc >Hello
#errors
#document
| <!DOCTYPE potato>
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato pUbLIcgoof>Hello
#errors
#document
| <!DOCTYPE potato>
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato PUBLIC goof>Hello
#errors
#document
| <!DOCTYPE potato>
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato PUBLIC "go'of">Hello
#errors
#document
| <!DOCTYPE potato "go'of" "">
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato PUBLIC 'go'of'>Hello
#errors
#document
| <!DOCTYPE potato "go" "">
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato PUBLIC 'go:hh of' >Hello
#errors
#document
| <!DOCTYPE potato "go:hh of" "">
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE potato PUBLIC "W3C-//dfdf" SYSTEM ggg>Hello
#errors
#document
| <!DOCTYPE potato "W3C-//dfdf" "">
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">Hello
#errors
#document
| <!DOCTYPE html "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE ...>Hello
#errors
#document
| <!DOCTYPE ...>
| <html>
| <head>
| <body>
| "Hello"
#data
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
#errors
#document
| <!DOCTYPE html "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
| <html>
| <head>
| <body>
#data
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
#errors
#document
| <!DOCTYPE html "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
| <html>
| <head>
| <body>
#data
<!DOCTYPE root-element [SYSTEM OR PUBLIC FPI] "uri" [
<!-- internal declarations -->
]>
#errors
#document
| <!DOCTYPE root-element>
| <html>
| <head>
| <body>
| "]>"
#data
<!DOCTYPE html PUBLIC
"-//WAPFORUM//DTD XHTML Mobile 1.0//EN"
"http://www.wapforum.org/DTD/xhtml-mobile10.dtd">
#errors
#document
| <!DOCTYPE html "-//WAPFORUM//DTD XHTML Mobile 1.0//EN" "http://www.wapforum.org/DTD/xhtml-mobile10.dtd">
| <html>
| <head>
| <body>
#data
<!DOCTYPE HTML SYSTEM "http://www.w3.org/DTD/HTML4-strict.dtd"><body><b>Mine!</b></body>
#errors
#document
| <!DOCTYPE html "" "http://www.w3.org/DTD/HTML4-strict.dtd">
| <html>
| <head>
| <body>
| <b>
| "Mine!"
#data
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
#errors
#document
| <!DOCTYPE html "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
| <html>
| <head>
| <body>
#data
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"'http://www.w3.org/TR/html4/strict.dtd'>
#errors
#document
| <!DOCTYPE html "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
| <html>
| <head>
| <body>
#data
<!DOCTYPE HTML PUBLIC"-//W3C//DTD HTML 4.01//EN"'http://www.w3.org/TR/html4/strict.dtd'>
#errors
#document
| <!DOCTYPE html "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
| <html>
| <head>
| <body>
#data
<!DOCTYPE HTML PUBLIC'-//W3C//DTD HTML 4.01//EN''http://www.w3.org/TR/html4/strict.dtd'>
#errors
#document
| <!DOCTYPE html "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
| <html>
| <head>
| <body>

View File

@@ -0,0 +1,603 @@
#data
FOO&gt;BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO>BAR"
#data
FOO&gtBAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO>BAR"
#data
FOO&gt BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO> BAR"
#data
FOO&gt;;;BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO>;;BAR"
#data
I'm &notit; I tell you
#errors
#document
| <html>
| <head>
| <body>
| "I'm ¬it; I tell you"
#data
I'm &notin; I tell you
#errors
#document
| <html>
| <head>
| <body>
| "I'm ∉ I tell you"
#data
FOO& BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO& BAR"
#data
FOO&<BAR>
#errors
#document
| <html>
| <head>
| <body>
| "FOO&"
| <bar>
#data
FOO&&&&gt;BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO&&&>BAR"
#data
FOO&#41;BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO)BAR"
#data
FOO&#x41;BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOOABAR"
#data
FOO&#X41;BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOOABAR"
#data
FOO&#BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO&#BAR"
#data
FOO&#ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO&#ZOO"
#data
FOO&#xBAR
#errors
#document
| <html>
| <head>
| <body>
| "FOOºR"
#data
FOO&#xZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO&#xZOO"
#data
FOO&#XZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO&#XZOO"
#data
FOO&#41BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO)BAR"
#data
FOO&#x41BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO䆺R"
#data
FOO&#x41ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOAZOO"
#data
FOO&#x0000;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO<4F>ZOO"
#data
FOO&#x0078;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOxZOO"
#data
FOO&#x0079;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOyZOO"
#data
FOO&#x0080;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO€ZOO"
#data
FOO&#x0081;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOZOO"
#data
FOO&#x0082;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOZOO"
#data
FOO&#x0083;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOƒZOO"
#data
FOO&#x0084;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO„ZOO"
#data
FOO&#x0085;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO…ZOO"
#data
FOO&#x0086;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO†ZOO"
#data
FOO&#x0087;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO‡ZOO"
#data
FOO&#x0088;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOˆZOO"
#data
FOO&#x0089;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO‰ZOO"
#data
FOO&#x008A;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOŠZOO"
#data
FOO&#x008B;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOZOO"
#data
FOO&#x008C;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOŒZOO"
#data
FOO&#x008D;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOZOO"
#data
FOO&#x008E;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOŽZOO"
#data
FOO&#x008F;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOZOO"
#data
FOO&#x0090;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOZOO"
#data
FOO&#x0091;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOZOO"
#data
FOO&#x0092;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOZOO"
#data
FOO&#x0093;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO“ZOO"
#data
FOO&#x0094;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO”ZOO"
#data
FOO&#x0095;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO•ZOO"
#data
FOO&#x0096;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOZOO"
#data
FOO&#x0097;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO—ZOO"
#data
FOO&#x0098;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO˜ZOO"
#data
FOO&#x0099;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO™ZOO"
#data
FOO&#x009A;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOšZOO"
#data
FOO&#x009B;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOZOO"
#data
FOO&#x009C;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOœZOO"
#data
FOO&#x009D;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOZOO"
#data
FOO&#x009E;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOžZOO"
#data
FOO&#x009F;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOŸZOO"
#data
FOO&#x00A0;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO ZOO"
#data
FOO&#xD7FF;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO퟿ZOO"
#data
FOO&#xD800;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO<4F>ZOO"
#data
FOO&#xD801;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO<4F>ZOO"
#data
FOO&#xDFFE;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO<4F>ZOO"
#data
FOO&#xDFFF;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO<4F>ZOO"
#data
FOO&#xE000;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOOZOO"
#data
FOO&#x10FFFE;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO􏿾ZOO"
#data
FOO&#x1087D4;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO􈟔ZOO"
#data
FOO&#x10FFFF;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO􏿿ZOO"
#data
FOO&#x110000;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO<4F>ZOO"
#data
FOO&#xFFFFFF;ZOO
#errors
#document
| <html>
| <head>
| <body>
| "FOO<4F>ZOO"

View File

@@ -0,0 +1,249 @@
#data
<div bar="ZZ&gt;YY"></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ>YY"
#data
<div bar="ZZ&"></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ&"
#data
<div bar='ZZ&'></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ&"
#data
<div bar=ZZ&></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ&"
#data
<div bar="ZZ&gt=YY"></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ&gt=YY"
#data
<div bar="ZZ&gt0YY"></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ&gt0YY"
#data
<div bar="ZZ&gt9YY"></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ&gt9YY"
#data
<div bar="ZZ&gtaYY"></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ&gtaYY"
#data
<div bar="ZZ&gtZYY"></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ&gtZYY"
#data
<div bar="ZZ&gt YY"></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ> YY"
#data
<div bar="ZZ&gt"></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ>"
#data
<div bar='ZZ&gt'></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ>"
#data
<div bar=ZZ&gt></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ>"
#data
<div bar="ZZ&pound_id=23"></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ£_id=23"
#data
<div bar="ZZ&prod_id=23"></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ&prod_id=23"
#data
<div bar="ZZ&pound;_id=23"></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ£_id=23"
#data
<div bar="ZZ&prod;_id=23"></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ∏_id=23"
#data
<div bar="ZZ&pound=23"></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ&pound=23"
#data
<div bar="ZZ&prod=23"></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| bar="ZZ&prod=23"
#data
<div>ZZ&pound_id=23</div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| "ZZ£_id=23"
#data
<div>ZZ&prod_id=23</div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| "ZZ&prod_id=23"
#data
<div>ZZ&pound;_id=23</div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| "ZZ£_id=23"
#data
<div>ZZ&prod;_id=23</div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| "ZZ∏_id=23"
#data
<div>ZZ&pound=23</div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| "ZZ£=23"
#data
<div>ZZ&prod=23</div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| "ZZ&prod=23"

View File

@@ -0,0 +1,246 @@
#data
<div<div>
#errors
#document
| <html>
| <head>
| <body>
| <div<div>
#data
<div foo<bar=''>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| foo<bar=""
#data
<div foo=`bar`>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| foo="`bar`"
#data
<div \"foo=''>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| \"foo=""
#data
<a href='\nbar'></a>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| href="\nbar"
#data
<!DOCTYPE html>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
#data
&lang;&rang;
#errors
#document
| <html>
| <head>
| <body>
| "⟨⟩"
#data
&apos;
#errors
#document
| <html>
| <head>
| <body>
| "'"
#data
&ImaginaryI;
#errors
#document
| <html>
| <head>
| <body>
| ""
#data
&Kopf;
#errors
#document
| <html>
| <head>
| <body>
| "𝕂"
#data
&notinva;
#errors
#document
| <html>
| <head>
| <body>
| "∉"
#data
<?import namespace="foo" implementation="#bar">
#errors
#document
| <!-- ?import namespace="foo" implementation="#bar" -->
| <html>
| <head>
| <body>
#data
<!--foo--bar-->
#errors
#document
| <!-- foo--bar -->
| <html>
| <head>
| <body>
#data
<![CDATA[x]]>
#errors
#document
| <!-- [CDATA[x]] -->
| <html>
| <head>
| <body>
#data
<textarea><!--</textarea>--></textarea>
#errors
#document
| <html>
| <head>
| <body>
| <textarea>
| "<!--"
| "-->"
#data
<textarea><!--</textarea>-->
#errors
#document
| <html>
| <head>
| <body>
| <textarea>
| "<!--"
| "-->"
#data
<style><!--</style>--></style>
#errors
#document
| <html>
| <head>
| <style>
| "<!--"
| <body>
| "-->"
#data
<style><!--</style>-->
#errors
#document
| <html>
| <head>
| <style>
| "<!--"
| <body>
| "-->"
#data
<ul><li>A </li> <li>B</li></ul>
#errors
#document
| <html>
| <head>
| <body>
| <ul>
| <li>
| "A "
| " "
| <li>
| "B"
#data
<table><form><input type=hidden><input></form><div></div></table>
#errors
#document
| <html>
| <head>
| <body>
| <input>
| <div>
| <table>
| <form>
| <input>
| type="hidden"
#data
<i>A<b>B<p></i>C</b>D
#errors
#document
| <html>
| <head>
| <body>
| <i>
| "A"
| <b>
| "B"
| <b>
| <p>
| <b>
| <i>
| "C"
| "D"
#data
<div></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
#data
<svg></svg>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
#data
<math></math>
#errors
#document
| <html>
| <head>
| <body>
| <math math>

View File

@@ -0,0 +1,43 @@
#data
<button>1</foo>
#errors
#document
| <html>
| <head>
| <body>
| <button>
| "1"
#data
<foo>1<p>2</foo>
#errors
#document
| <html>
| <head>
| <body>
| <foo>
| "1"
| <p>
| "2"
#data
<dd>1</foo>
#errors
#document
| <html>
| <head>
| <body>
| <dd>
| "1"
#data
<foo>1<dd>2</foo>
#errors
#document
| <html>
| <head>
| <body>
| <foo>
| "1"
| <dd>
| "2"

View File

@@ -0,0 +1,40 @@
#data
<isindex>
#errors
#document
| <html>
| <head>
| <body>
| <form>
| <hr>
| <label>
| "This is a searchable index. Enter search keywords: "
| <input>
| name="isindex"
| <hr>
#data
<isindex name="A" action="B" prompt="C" foo="D">
#errors
#document
| <html>
| <head>
| <body>
| <form>
| action="B"
| <hr>
| <label>
| "C"
| <input>
| foo="D"
| name="isindex"
| <hr>
#data
<form><isindex>
#errors
#document
| <html>
| <head>
| <body>
| <form>

View File

@@ -0,0 +1,52 @@
#data
<input type="hidden"><frameset>
#errors
21: Start tag seen without seeing a doctype first. Expected “<!DOCTYPE html>”.
31: “frameset” start tag seen.
31: End of file seen and there were open elements.
#document
| <html>
| <head>
| <frameset>
#data
<!DOCTYPE html><table><caption><svg>foo</table>bar
#errors
47: End tag “table” did not match the name of the current open element (“svg”).
47: “table” closed but “caption” was still open.
47: End tag “table” seen, but there were open elements.
36: Unclosed element “svg”.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <caption>
| <svg svg>
| "foo"
| "bar"
#data
<table><tr><td><svg><desc><td></desc><circle>
#errors
7: Start tag seen without seeing a doctype first. Expected “<!DOCTYPE html>”.
30: A table cell was implicitly closed, but there were open elements.
26: Unclosed element “desc”.
20: Unclosed element “svg”.
37: Stray end tag “desc”.
45: End of file seen and there were open elements.
45: Unclosed element “circle”.
7: Unclosed element “table”.
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <svg svg>
| <svg desc>
| <td>
| <circle>

View File

Binary file not shown.

View File

@@ -0,0 +1,308 @@
#data
FOO<script>'Hello'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| "'Hello'"
| "BAR"
#data
FOO<script></script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| "BAR"
#data
FOO<script></script >BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| "BAR"
#data
FOO<script></script/>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| "BAR"
#data
FOO<script></script/ >BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| "BAR"
#data
FOO<script type="text/plain"></scriptx>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| type="text/plain"
| "</scriptx>BAR"
#data
FOO<script></script foo=">" dd>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| "BAR"
#data
FOO<script>'<'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| "'<'"
| "BAR"
#data
FOO<script>'<!'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| "'<!'"
| "BAR"
#data
FOO<script>'<!-'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| "'<!-'"
| "BAR"
#data
FOO<script>'<!--'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| "'<!--'"
| "BAR"
#data
FOO<script>'<!---'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| "'<!---'"
| "BAR"
#data
FOO<script>'<!-->'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| "'<!-->'"
| "BAR"
#data
FOO<script>'<!-->'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| "'<!-->'"
| "BAR"
#data
FOO<script>'<!-- potato'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| "'<!-- potato'"
| "BAR"
#data
FOO<script>'<!-- <sCrIpt'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| "'<!-- <sCrIpt'"
| "BAR"
#data
FOO<script type="text/plain">'<!-- <sCrIpt>'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| type="text/plain"
| "'<!-- <sCrIpt>'</script>BAR"
#data
FOO<script type="text/plain">'<!-- <sCrIpt> -'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| type="text/plain"
| "'<!-- <sCrIpt> -'</script>BAR"
#data
FOO<script type="text/plain">'<!-- <sCrIpt> --'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| type="text/plain"
| "'<!-- <sCrIpt> --'</script>BAR"
#data
FOO<script>'<!-- <sCrIpt> -->'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| "'<!-- <sCrIpt> -->'"
| "BAR"
#data
FOO<script type="text/plain">'<!-- <sCrIpt> --!>'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| type="text/plain"
| "'<!-- <sCrIpt> --!>'</script>BAR"
#data
FOO<script type="text/plain">'<!-- <sCrIpt> -- >'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| type="text/plain"
| "'<!-- <sCrIpt> -- >'</script>BAR"
#data
FOO<script type="text/plain">'<!-- <sCrIpt '</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| type="text/plain"
| "'<!-- <sCrIpt '</script>BAR"
#data
FOO<script type="text/plain">'<!-- <sCrIpt/'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| type="text/plain"
| "'<!-- <sCrIpt/'</script>BAR"
#data
FOO<script type="text/plain">'<!-- <sCrIpt\'</script>BAR
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| type="text/plain"
| "'<!-- <sCrIpt\'"
| "BAR"
#data
FOO<script type="text/plain">'<!-- <sCrIpt/'</script>BAR</script>QUX
#errors
#document
| <html>
| <head>
| <body>
| "FOO"
| <script>
| type="text/plain"
| "'<!-- <sCrIpt/'</script>BAR"
| "QUX"

View File

@@ -0,0 +1,15 @@
#data
<p><b id="A"><script>document.getElementById("A").id = "B"</script></p>TEXT</b>
#errors
#document
| <html>
| <head>
| <body>
| <p>
| <b>
| id="B"
| <script>
| "document.getElementById("A").id = "B""
| <b>
| id="A"
| "TEXT"

View File

@@ -0,0 +1,28 @@
#data
1<script>document.write("2")</script>3
#errors
#document
| <html>
| <head>
| <body>
| "1"
| <script>
| "document.write("2")"
| "23"
#data
1<script>document.write("<script>document.write('2')</scr"+ "ipt><script>document.write('3')</scr" + "ipt>")</script>4
#errors
#document
| <html>
| <head>
| <body>
| "1"
| <script>
| "document.write("<script>document.write('2')</scr"+ "ipt><script>document.write('3')</scr" + "ipt>")"
| <script>
| "document.write('2')"
| "2"
| <script>
| "document.write('3')"
| "34"

View File

@@ -0,0 +1,212 @@
#data
<table><th>
#errors
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <th>
#data
<table><td>
#errors
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
#data
<table><col foo='bar'>
#errors
#document
| <html>
| <head>
| <body>
| <table>
| <colgroup>
| <col>
| foo="bar"
#data
<table><colgroup></html>foo
#errors
#document
| <html>
| <head>
| <body>
| "foo"
| <table>
| <colgroup>
#data
<table></table><p>foo
#errors
#document
| <html>
| <head>
| <body>
| <table>
| <p>
| "foo"
#data
<table></body></caption></col></colgroup></html></tbody></td></tfoot></th></thead></tr><td>
#errors
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
#data
<table><select><option>3</select></table>
#errors
#document
| <html>
| <head>
| <body>
| <select>
| <option>
| "3"
| <table>
#data
<table><select><table></table></select></table>
#errors
#document
| <html>
| <head>
| <body>
| <select>
| <table>
| <table>
#data
<table><select></table>
#errors
#document
| <html>
| <head>
| <body>
| <select>
| <table>
#data
<table><select><option>A<tr><td>B</td></tr></table>
#errors
#document
| <html>
| <head>
| <body>
| <select>
| <option>
| "A"
| <table>
| <tbody>
| <tr>
| <td>
| "B"
#data
<table><td></body></caption></col></colgroup></html>foo
#errors
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| "foo"
#data
<table><td>A</table>B
#errors
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| "A"
| "B"
#data
<table><tr><caption>
#errors
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <caption>
#data
<table><tr></body></caption></col></colgroup></html></td></th><td>foo
#errors
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| "foo"
#data
<table><td><tr>
#errors
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <tr>
#data
<table><td><button><td>
#errors
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <button>
| <td>
#data
<table><tr><td><svg><desc><td>
#errors
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <svg svg>
| <svg desc>
| <td>

View File

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,799 @@
#data
<!DOCTYPE html><svg></svg>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
#data
<!DOCTYPE html><svg></svg><![CDATA[a]]>
#errors
29: Bogus comment
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <!-- [CDATA[a]] -->
#data
<!DOCTYPE html><body><svg></svg>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
#data
<!DOCTYPE html><body><select><svg></svg></select>
#errors
35: Stray “svg” start tag.
42: Stray end tag “svg”
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
#data
<!DOCTYPE html><body><select><option><svg></svg></option></select>
#errors
43: Stray “svg” start tag.
50: Stray end tag “svg”
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
| <option>
#data
<!DOCTYPE html><body><table><svg></svg></table>
#errors
34: Start tag “svg” seen in “table”.
41: Stray end tag “svg”.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <table>
#data
<!DOCTYPE html><body><table><svg><g>foo</g></svg></table>
#errors
34: Start tag “svg” seen in “table”.
46: Stray end tag “g”.
53: Stray end tag “svg”.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <svg g>
| "foo"
| <table>
#data
<!DOCTYPE html><body><table><svg><g>foo</g><g>bar</g></svg></table>
#errors
34: Start tag “svg” seen in “table”.
46: Stray end tag “g”.
58: Stray end tag “g”.
65: Stray end tag “svg”.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <svg g>
| "foo"
| <svg g>
| "bar"
| <table>
#data
<!DOCTYPE html><body><table><tbody><svg><g>foo</g><g>bar</g></svg></tbody></table>
#errors
41: Start tag “svg” seen in “table”.
53: Stray end tag “g”.
65: Stray end tag “g”.
72: Stray end tag “svg”.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <svg g>
| "foo"
| <svg g>
| "bar"
| <table>
| <tbody>
#data
<!DOCTYPE html><body><table><tbody><tr><svg><g>foo</g><g>bar</g></svg></tr></tbody></table>
#errors
45: Start tag “svg” seen in “table”.
57: Stray end tag “g”.
69: Stray end tag “g”.
76: Stray end tag “svg”.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <svg g>
| "foo"
| <svg g>
| "bar"
| <table>
| <tbody>
| <tr>
#data
<!DOCTYPE html><body><table><tbody><tr><td><svg><g>foo</g><g>bar</g></svg></td></tr></tbody></table>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <svg svg>
| <svg g>
| "foo"
| <svg g>
| "bar"
#data
<!DOCTYPE html><body><table><tbody><tr><td><svg><g>foo</g><g>bar</g></svg><p>baz</td></tr></tbody></table>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <svg svg>
| <svg g>
| "foo"
| <svg g>
| "bar"
| <p>
| "baz"
#data
<!DOCTYPE html><body><table><caption><svg><g>foo</g><g>bar</g></svg><p>baz</caption></table>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <caption>
| <svg svg>
| <svg g>
| "foo"
| <svg g>
| "bar"
| <p>
| "baz"
#data
<!DOCTYPE html><body><table><caption><svg><g>foo</g><g>bar</g><p>baz</table><p>quux
#errors
70: HTML start tag “p” in a foreign namespace context.
81: “table” closed but “caption” was still open.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <caption>
| <svg svg>
| <svg g>
| "foo"
| <svg g>
| "bar"
| <p>
| "baz"
| <p>
| "quux"
#data
<!DOCTYPE html><body><table><caption><svg><g>foo</g><g>bar</g>baz</table><p>quux
#errors
78: “table” closed but “caption” was still open.
78: Unclosed elements on stack.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <caption>
| <svg svg>
| <svg g>
| "foo"
| <svg g>
| "bar"
| "baz"
| <p>
| "quux"
#data
<!DOCTYPE html><body><table><colgroup><svg><g>foo</g><g>bar</g><p>baz</table><p>quux
#errors
44: Start tag “svg” seen in “table”.
56: Stray end tag “g”.
68: Stray end tag “g”.
71: HTML start tag “p” in a foreign namespace context.
71: Start tag “p” seen in “table”.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <svg g>
| "foo"
| <svg g>
| "bar"
| <p>
| "baz"
| <table>
| <colgroup>
| <p>
| "quux"
#data
<!DOCTYPE html><body><table><tr><td><select><svg><g>foo</g><g>bar</g><p>baz</table><p>quux
#errors
50: Stray “svg” start tag.
54: Stray “g” start tag.
62: Stray end tag “g”
66: Stray “g” start tag.
74: Stray end tag “g”
77: Stray “p” start tag.
88: “table” end tag with “select” open.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <select>
| "foobarbaz"
| <p>
| "quux"
#data
<!DOCTYPE html><body><table><select><svg><g>foo</g><g>bar</g><p>baz</table><p>quux
#errors
36: Start tag “select” seen in “table”.
42: Stray “svg” start tag.
46: Stray “g” start tag.
54: Stray end tag “g”
58: Stray “g” start tag.
66: Stray end tag “g”
69: Stray “p” start tag.
80: “table” end tag with “select” open.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
| "foobarbaz"
| <table>
| <p>
| "quux"
#data
<!DOCTYPE html><body></body></html><svg><g>foo</g><g>bar</g><p>baz
#errors
41: Stray “svg” start tag.
68: HTML start tag “p” in a foreign namespace context.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <svg g>
| "foo"
| <svg g>
| "bar"
| <p>
| "baz"
#data
<!DOCTYPE html><body></body><svg><g>foo</g><g>bar</g><p>baz
#errors
34: Stray “svg” start tag.
61: HTML start tag “p” in a foreign namespace context.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <svg g>
| "foo"
| <svg g>
| "bar"
| <p>
| "baz"
#data
<!DOCTYPE html><frameset><svg><g></g><g></g><p><span>
#errors
31: Stray “svg” start tag.
35: Stray “g” start tag.
40: Stray end tag “g”
44: Stray “g” start tag.
49: Stray end tag “g”
52: Stray “p” start tag.
58: Stray “span” start tag.
58: End of file seen and there were open elements.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <frameset>
#data
<!DOCTYPE html><frameset></frameset><svg><g></g><g></g><p><span>
#errors
42: Stray “svg” start tag.
46: Stray “g” start tag.
51: Stray end tag “g”
55: Stray “g” start tag.
60: Stray end tag “g”
63: Stray “p” start tag.
69: Stray “span” start tag.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <frameset>
#data
<!DOCTYPE html><body xlink:href=foo><svg xlink:href=foo></svg>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| xlink:href="foo"
| <svg svg>
| xlink href="foo"
#data
<!DOCTYPE html><body xlink:href=foo xml:lang=en><svg><g xml:lang=en xlink:href=foo></g></svg>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| xlink:href="foo"
| xml:lang="en"
| <svg svg>
| <svg g>
| xlink href="foo"
| xml lang="en"
#data
<!DOCTYPE html><body xlink:href=foo xml:lang=en><svg><g xml:lang=en xlink:href=foo /></svg>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| xlink:href="foo"
| xml:lang="en"
| <svg svg>
| <svg g>
| xlink href="foo"
| xml lang="en"
#data
<!DOCTYPE html><body xlink:href=foo xml:lang=en><svg><g xml:lang=en xlink:href=foo />bar</svg>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| xlink:href="foo"
| xml:lang="en"
| <svg svg>
| <svg g>
| xlink href="foo"
| xml lang="en"
| "bar"
#data
<svg></path>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
#data
<div><svg></div>a
#errors
#document
| <html>
| <head>
| <body>
| <div>
| <svg svg>
| "a"
#data
<div><svg><path></div>a
#errors
#document
| <html>
| <head>
| <body>
| <div>
| <svg svg>
| <svg path>
| "a"
#data
<div><svg><path></svg><path>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| <svg svg>
| <svg path>
| <path>
#data
<div><svg><path><foreignObject><math></div>a
#errors
#document
| <html>
| <head>
| <body>
| <div>
| <svg svg>
| <svg path>
| <svg foreignObject>
| <math math>
| "a"
#data
<div><svg><path><foreignObject><p></div>a
#errors
#document
| <html>
| <head>
| <body>
| <div>
| <svg svg>
| <svg path>
| <svg foreignObject>
| <p>
| "a"
#data
<!DOCTYPE html><svg><desc><div><svg><ul>a
#errors
40: HTML start tag “ul” in a foreign namespace context.
41: End of file in a foreign namespace context.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <svg desc>
| <div>
| <svg svg>
| <ul>
| "a"
#data
<!DOCTYPE html><svg><desc><svg><ul>a
#errors
35: HTML start tag “ul” in a foreign namespace context.
36: End of file in a foreign namespace context.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <svg desc>
| <svg svg>
| <ul>
| "a"
#data
<!DOCTYPE html><p><svg><desc><p>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <svg svg>
| <svg desc>
| <p>
#data
<!DOCTYPE html><p><svg><title><p>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <svg svg>
| <svg title>
| <p>
#data
<div><svg><path><foreignObject><p></foreignObject><p>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| <svg svg>
| <svg path>
| <svg foreignObject>
| <p>
| <p>
#data
<math><mi><div><object><div><span></span></div></object></div></mi><mi>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math mi>
| <div>
| <object>
| <div>
| <span>
| <math mi>
#data
<math><mi><svg><foreignObject><div><div></div></div></foreignObject></svg></mi><mi>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math mi>
| <svg svg>
| <svg foreignObject>
| <div>
| <div>
| <math mi>
#data
<svg><script></script><path>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| <svg script>
| <svg path>
#data
<table><svg></svg><tr>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| <table>
| <tbody>
| <tr>
#data
<math><mi><mglyph>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math mi>
| <math mglyph>
#data
<math><mi><malignmark>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math mi>
| <math malignmark>
#data
<math><mo><mglyph>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math mo>
| <math mglyph>
#data
<math><mo><malignmark>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math mo>
| <math malignmark>
#data
<math><mn><mglyph>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math mn>
| <math mglyph>
#data
<math><mn><malignmark>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math mn>
| <math malignmark>
#data
<math><ms><mglyph>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math ms>
| <math mglyph>
#data
<math><ms><malignmark>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math ms>
| <math malignmark>
#data
<math><mtext><mglyph>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math mtext>
| <math mglyph>
#data
<math><mtext><malignmark>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math mtext>
| <math malignmark>
#data
<math><annotation-xml><svg></svg></annotation-xml><mi>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math annotation-xml>
| <svg svg>
| <math mi>
#data
<math><annotation-xml><svg><foreignObject><div><math><mi></mi></math><span></span></div></foreignObject><path></path></svg></annotation-xml><mi>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math annotation-xml>
| <svg svg>
| <svg foreignObject>
| <div>
| <math math>
| <math mi>
| <span>
| <svg path>
| <math mi>
#data
<math><annotation-xml><svg><foreignObject><math><mi><svg></svg></mi><mo></mo></math><span></span></foreignObject><path></path></svg></annotation-xml><mi>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math annotation-xml>
| <svg svg>
| <svg foreignObject>
| <math math>
| <math mi>
| <svg svg>
| <math mo>
| <span>
| <svg path>
| <math mi>

View File

@@ -0,0 +1,482 @@
#data
<!DOCTYPE html><body><svg attributeName='' attributeType='' baseFrequency='' baseProfile='' calcMode='' clipPathUnits='' contentScriptType='' contentStyleType='' diffuseConstant='' edgeMode='' externalResourcesRequired='' filterRes='' filterUnits='' glyphRef='' gradientTransform='' gradientUnits='' kernelMatrix='' kernelUnitLength='' keyPoints='' keySplines='' keyTimes='' lengthAdjust='' limitingConeAngle='' markerHeight='' markerUnits='' markerWidth='' maskContentUnits='' maskUnits='' numOctaves='' pathLength='' patternContentUnits='' patternTransform='' patternUnits='' pointsAtX='' pointsAtY='' pointsAtZ='' preserveAlpha='' preserveAspectRatio='' primitiveUnits='' refX='' refY='' repeatCount='' repeatDur='' requiredExtensions='' requiredFeatures='' specularConstant='' specularExponent='' spreadMethod='' startOffset='' stdDeviation='' stitchTiles='' surfaceScale='' systemLanguage='' tableValues='' targetX='' targetY='' textLength='' viewBox='' viewTarget='' xChannelSelector='' yChannelSelector='' zoomAndPan=''></svg>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| attributeName=""
| attributeType=""
| baseFrequency=""
| baseProfile=""
| calcMode=""
| clipPathUnits=""
| contentScriptType=""
| contentStyleType=""
| diffuseConstant=""
| edgeMode=""
| externalResourcesRequired=""
| filterRes=""
| filterUnits=""
| glyphRef=""
| gradientTransform=""
| gradientUnits=""
| kernelMatrix=""
| kernelUnitLength=""
| keyPoints=""
| keySplines=""
| keyTimes=""
| lengthAdjust=""
| limitingConeAngle=""
| markerHeight=""
| markerUnits=""
| markerWidth=""
| maskContentUnits=""
| maskUnits=""
| numOctaves=""
| pathLength=""
| patternContentUnits=""
| patternTransform=""
| patternUnits=""
| pointsAtX=""
| pointsAtY=""
| pointsAtZ=""
| preserveAlpha=""
| preserveAspectRatio=""
| primitiveUnits=""
| refX=""
| refY=""
| repeatCount=""
| repeatDur=""
| requiredExtensions=""
| requiredFeatures=""
| specularConstant=""
| specularExponent=""
| spreadMethod=""
| startOffset=""
| stdDeviation=""
| stitchTiles=""
| surfaceScale=""
| systemLanguage=""
| tableValues=""
| targetX=""
| targetY=""
| textLength=""
| viewBox=""
| viewTarget=""
| xChannelSelector=""
| yChannelSelector=""
| zoomAndPan=""
#data
<!DOCTYPE html><BODY><SVG ATTRIBUTENAME='' ATTRIBUTETYPE='' BASEFREQUENCY='' BASEPROFILE='' CALCMODE='' CLIPPATHUNITS='' CONTENTSCRIPTTYPE='' CONTENTSTYLETYPE='' DIFFUSECONSTANT='' EDGEMODE='' EXTERNALRESOURCESREQUIRED='' FILTERRES='' FILTERUNITS='' GLYPHREF='' GRADIENTTRANSFORM='' GRADIENTUNITS='' KERNELMATRIX='' KERNELUNITLENGTH='' KEYPOINTS='' KEYSPLINES='' KEYTIMES='' LENGTHADJUST='' LIMITINGCONEANGLE='' MARKERHEIGHT='' MARKERUNITS='' MARKERWIDTH='' MASKCONTENTUNITS='' MASKUNITS='' NUMOCTAVES='' PATHLENGTH='' PATTERNCONTENTUNITS='' PATTERNTRANSFORM='' PATTERNUNITS='' POINTSATX='' POINTSATY='' POINTSATZ='' PRESERVEALPHA='' PRESERVEASPECTRATIO='' PRIMITIVEUNITS='' REFX='' REFY='' REPEATCOUNT='' REPEATDUR='' REQUIREDEXTENSIONS='' REQUIREDFEATURES='' SPECULARCONSTANT='' SPECULAREXPONENT='' SPREADMETHOD='' STARTOFFSET='' STDDEVIATION='' STITCHTILES='' SURFACESCALE='' SYSTEMLANGUAGE='' TABLEVALUES='' TARGETX='' TARGETY='' TEXTLENGTH='' VIEWBOX='' VIEWTARGET='' XCHANNELSELECTOR='' YCHANNELSELECTOR='' ZOOMANDPAN=''></SVG>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| attributeName=""
| attributeType=""
| baseFrequency=""
| baseProfile=""
| calcMode=""
| clipPathUnits=""
| contentScriptType=""
| contentStyleType=""
| diffuseConstant=""
| edgeMode=""
| externalResourcesRequired=""
| filterRes=""
| filterUnits=""
| glyphRef=""
| gradientTransform=""
| gradientUnits=""
| kernelMatrix=""
| kernelUnitLength=""
| keyPoints=""
| keySplines=""
| keyTimes=""
| lengthAdjust=""
| limitingConeAngle=""
| markerHeight=""
| markerUnits=""
| markerWidth=""
| maskContentUnits=""
| maskUnits=""
| numOctaves=""
| pathLength=""
| patternContentUnits=""
| patternTransform=""
| patternUnits=""
| pointsAtX=""
| pointsAtY=""
| pointsAtZ=""
| preserveAlpha=""
| preserveAspectRatio=""
| primitiveUnits=""
| refX=""
| refY=""
| repeatCount=""
| repeatDur=""
| requiredExtensions=""
| requiredFeatures=""
| specularConstant=""
| specularExponent=""
| spreadMethod=""
| startOffset=""
| stdDeviation=""
| stitchTiles=""
| surfaceScale=""
| systemLanguage=""
| tableValues=""
| targetX=""
| targetY=""
| textLength=""
| viewBox=""
| viewTarget=""
| xChannelSelector=""
| yChannelSelector=""
| zoomAndPan=""
#data
<!DOCTYPE html><body><svg attributename='' attributetype='' basefrequency='' baseprofile='' calcmode='' clippathunits='' contentscripttype='' contentstyletype='' diffuseconstant='' edgemode='' externalresourcesrequired='' filterres='' filterunits='' glyphref='' gradienttransform='' gradientunits='' kernelmatrix='' kernelunitlength='' keypoints='' keysplines='' keytimes='' lengthadjust='' limitingconeangle='' markerheight='' markerunits='' markerwidth='' maskcontentunits='' maskunits='' numoctaves='' pathlength='' patterncontentunits='' patterntransform='' patternunits='' pointsatx='' pointsaty='' pointsatz='' preservealpha='' preserveaspectratio='' primitiveunits='' refx='' refy='' repeatcount='' repeatdur='' requiredextensions='' requiredfeatures='' specularconstant='' specularexponent='' spreadmethod='' startoffset='' stddeviation='' stitchtiles='' surfacescale='' systemlanguage='' tablevalues='' targetx='' targety='' textlength='' viewbox='' viewtarget='' xchannelselector='' ychannelselector='' zoomandpan=''></svg>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| attributeName=""
| attributeType=""
| baseFrequency=""
| baseProfile=""
| calcMode=""
| clipPathUnits=""
| contentScriptType=""
| contentStyleType=""
| diffuseConstant=""
| edgeMode=""
| externalResourcesRequired=""
| filterRes=""
| filterUnits=""
| glyphRef=""
| gradientTransform=""
| gradientUnits=""
| kernelMatrix=""
| kernelUnitLength=""
| keyPoints=""
| keySplines=""
| keyTimes=""
| lengthAdjust=""
| limitingConeAngle=""
| markerHeight=""
| markerUnits=""
| markerWidth=""
| maskContentUnits=""
| maskUnits=""
| numOctaves=""
| pathLength=""
| patternContentUnits=""
| patternTransform=""
| patternUnits=""
| pointsAtX=""
| pointsAtY=""
| pointsAtZ=""
| preserveAlpha=""
| preserveAspectRatio=""
| primitiveUnits=""
| refX=""
| refY=""
| repeatCount=""
| repeatDur=""
| requiredExtensions=""
| requiredFeatures=""
| specularConstant=""
| specularExponent=""
| spreadMethod=""
| startOffset=""
| stdDeviation=""
| stitchTiles=""
| surfaceScale=""
| systemLanguage=""
| tableValues=""
| targetX=""
| targetY=""
| textLength=""
| viewBox=""
| viewTarget=""
| xChannelSelector=""
| yChannelSelector=""
| zoomAndPan=""
#data
<!DOCTYPE html><body><math attributeName='' attributeType='' baseFrequency='' baseProfile='' calcMode='' clipPathUnits='' contentScriptType='' contentStyleType='' diffuseConstant='' edgeMode='' externalResourcesRequired='' filterRes='' filterUnits='' glyphRef='' gradientTransform='' gradientUnits='' kernelMatrix='' kernelUnitLength='' keyPoints='' keySplines='' keyTimes='' lengthAdjust='' limitingConeAngle='' markerHeight='' markerUnits='' markerWidth='' maskContentUnits='' maskUnits='' numOctaves='' pathLength='' patternContentUnits='' patternTransform='' patternUnits='' pointsAtX='' pointsAtY='' pointsAtZ='' preserveAlpha='' preserveAspectRatio='' primitiveUnits='' refX='' refY='' repeatCount='' repeatDur='' requiredExtensions='' requiredFeatures='' specularConstant='' specularExponent='' spreadMethod='' startOffset='' stdDeviation='' stitchTiles='' surfaceScale='' systemLanguage='' tableValues='' targetX='' targetY='' textLength='' viewBox='' viewTarget='' xChannelSelector='' yChannelSelector='' zoomAndPan=''></math>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <math math>
| attributename=""
| attributetype=""
| basefrequency=""
| baseprofile=""
| calcmode=""
| clippathunits=""
| contentscripttype=""
| contentstyletype=""
| diffuseconstant=""
| edgemode=""
| externalresourcesrequired=""
| filterres=""
| filterunits=""
| glyphref=""
| gradienttransform=""
| gradientunits=""
| kernelmatrix=""
| kernelunitlength=""
| keypoints=""
| keysplines=""
| keytimes=""
| lengthadjust=""
| limitingconeangle=""
| markerheight=""
| markerunits=""
| markerwidth=""
| maskcontentunits=""
| maskunits=""
| numoctaves=""
| pathlength=""
| patterncontentunits=""
| patterntransform=""
| patternunits=""
| pointsatx=""
| pointsaty=""
| pointsatz=""
| preservealpha=""
| preserveaspectratio=""
| primitiveunits=""
| refx=""
| refy=""
| repeatcount=""
| repeatdur=""
| requiredextensions=""
| requiredfeatures=""
| specularconstant=""
| specularexponent=""
| spreadmethod=""
| startoffset=""
| stddeviation=""
| stitchtiles=""
| surfacescale=""
| systemlanguage=""
| tablevalues=""
| targetx=""
| targety=""
| textlength=""
| viewbox=""
| viewtarget=""
| xchannelselector=""
| ychannelselector=""
| zoomandpan=""
#data
<!DOCTYPE html><body><svg><altGlyph /><altGlyphDef /><altGlyphItem /><animateColor /><animateMotion /><animateTransform /><clipPath /><feBlend /><feColorMatrix /><feComponentTransfer /><feComposite /><feConvolveMatrix /><feDiffuseLighting /><feDisplacementMap /><feDistantLight /><feFlood /><feFuncA /><feFuncB /><feFuncG /><feFuncR /><feGaussianBlur /><feImage /><feMerge /><feMergeNode /><feMorphology /><feOffset /><fePointLight /><feSpecularLighting /><feSpotLight /><feTile /><feTurbulence /><foreignObject /><glyphRef /><linearGradient /><radialGradient /><textPath /></svg>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <svg altGlyph>
| <svg altGlyphDef>
| <svg altGlyphItem>
| <svg animateColor>
| <svg animateMotion>
| <svg animateTransform>
| <svg clipPath>
| <svg feBlend>
| <svg feColorMatrix>
| <svg feComponentTransfer>
| <svg feComposite>
| <svg feConvolveMatrix>
| <svg feDiffuseLighting>
| <svg feDisplacementMap>
| <svg feDistantLight>
| <svg feFlood>
| <svg feFuncA>
| <svg feFuncB>
| <svg feFuncG>
| <svg feFuncR>
| <svg feGaussianBlur>
| <svg feImage>
| <svg feMerge>
| <svg feMergeNode>
| <svg feMorphology>
| <svg feOffset>
| <svg fePointLight>
| <svg feSpecularLighting>
| <svg feSpotLight>
| <svg feTile>
| <svg feTurbulence>
| <svg foreignObject>
| <svg glyphRef>
| <svg linearGradient>
| <svg radialGradient>
| <svg textPath>
#data
<!DOCTYPE html><body><svg><altglyph /><altglyphdef /><altglyphitem /><animatecolor /><animatemotion /><animatetransform /><clippath /><feblend /><fecolormatrix /><fecomponenttransfer /><fecomposite /><feconvolvematrix /><fediffuselighting /><fedisplacementmap /><fedistantlight /><feflood /><fefunca /><fefuncb /><fefuncg /><fefuncr /><fegaussianblur /><feimage /><femerge /><femergenode /><femorphology /><feoffset /><fepointlight /><fespecularlighting /><fespotlight /><fetile /><feturbulence /><foreignobject /><glyphref /><lineargradient /><radialgradient /><textpath /></svg>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <svg altGlyph>
| <svg altGlyphDef>
| <svg altGlyphItem>
| <svg animateColor>
| <svg animateMotion>
| <svg animateTransform>
| <svg clipPath>
| <svg feBlend>
| <svg feColorMatrix>
| <svg feComponentTransfer>
| <svg feComposite>
| <svg feConvolveMatrix>
| <svg feDiffuseLighting>
| <svg feDisplacementMap>
| <svg feDistantLight>
| <svg feFlood>
| <svg feFuncA>
| <svg feFuncB>
| <svg feFuncG>
| <svg feFuncR>
| <svg feGaussianBlur>
| <svg feImage>
| <svg feMerge>
| <svg feMergeNode>
| <svg feMorphology>
| <svg feOffset>
| <svg fePointLight>
| <svg feSpecularLighting>
| <svg feSpotLight>
| <svg feTile>
| <svg feTurbulence>
| <svg foreignObject>
| <svg glyphRef>
| <svg linearGradient>
| <svg radialGradient>
| <svg textPath>
#data
<!DOCTYPE html><BODY><SVG><ALTGLYPH /><ALTGLYPHDEF /><ALTGLYPHITEM /><ANIMATECOLOR /><ANIMATEMOTION /><ANIMATETRANSFORM /><CLIPPATH /><FEBLEND /><FECOLORMATRIX /><FECOMPONENTTRANSFER /><FECOMPOSITE /><FECONVOLVEMATRIX /><FEDIFFUSELIGHTING /><FEDISPLACEMENTMAP /><FEDISTANTLIGHT /><FEFLOOD /><FEFUNCA /><FEFUNCB /><FEFUNCG /><FEFUNCR /><FEGAUSSIANBLUR /><FEIMAGE /><FEMERGE /><FEMERGENODE /><FEMORPHOLOGY /><FEOFFSET /><FEPOINTLIGHT /><FESPECULARLIGHTING /><FESPOTLIGHT /><FETILE /><FETURBULENCE /><FOREIGNOBJECT /><GLYPHREF /><LINEARGRADIENT /><RADIALGRADIENT /><TEXTPATH /></SVG>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <svg altGlyph>
| <svg altGlyphDef>
| <svg altGlyphItem>
| <svg animateColor>
| <svg animateMotion>
| <svg animateTransform>
| <svg clipPath>
| <svg feBlend>
| <svg feColorMatrix>
| <svg feComponentTransfer>
| <svg feComposite>
| <svg feConvolveMatrix>
| <svg feDiffuseLighting>
| <svg feDisplacementMap>
| <svg feDistantLight>
| <svg feFlood>
| <svg feFuncA>
| <svg feFuncB>
| <svg feFuncG>
| <svg feFuncR>
| <svg feGaussianBlur>
| <svg feImage>
| <svg feMerge>
| <svg feMergeNode>
| <svg feMorphology>
| <svg feOffset>
| <svg fePointLight>
| <svg feSpecularLighting>
| <svg feSpotLight>
| <svg feTile>
| <svg feTurbulence>
| <svg foreignObject>
| <svg glyphRef>
| <svg linearGradient>
| <svg radialGradient>
| <svg textPath>
#data
<!DOCTYPE html><body><math><altGlyph /><altGlyphDef /><altGlyphItem /><animateColor /><animateMotion /><animateTransform /><clipPath /><feBlend /><feColorMatrix /><feComponentTransfer /><feComposite /><feConvolveMatrix /><feDiffuseLighting /><feDisplacementMap /><feDistantLight /><feFlood /><feFuncA /><feFuncB /><feFuncG /><feFuncR /><feGaussianBlur /><feImage /><feMerge /><feMergeNode /><feMorphology /><feOffset /><fePointLight /><feSpecularLighting /><feSpotLight /><feTile /><feTurbulence /><foreignObject /><glyphRef /><linearGradient /><radialGradient /><textPath /></math>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <math math>
| <math altglyph>
| <math altglyphdef>
| <math altglyphitem>
| <math animatecolor>
| <math animatemotion>
| <math animatetransform>
| <math clippath>
| <math feblend>
| <math fecolormatrix>
| <math fecomponenttransfer>
| <math fecomposite>
| <math feconvolvematrix>
| <math fediffuselighting>
| <math fedisplacementmap>
| <math fedistantlight>
| <math feflood>
| <math fefunca>
| <math fefuncb>
| <math fefuncg>
| <math fefuncr>
| <math fegaussianblur>
| <math feimage>
| <math femerge>
| <math femergenode>
| <math femorphology>
| <math feoffset>
| <math fepointlight>
| <math fespecularlighting>
| <math fespotlight>
| <math fetile>
| <math feturbulence>
| <math foreignobject>
| <math glyphref>
| <math lineargradient>
| <math radialgradient>
| <math textpath>
#data
<!DOCTYPE html><body><svg><solidColor /></svg>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <svg solidcolor>

View File

@@ -0,0 +1,62 @@
#data
<!DOCTYPE html><body><p>foo<math><mtext><i>baz</i></mtext><annotation-xml><svg><desc><b>eggs</b></desc><g><foreignObject><P>spam<TABLE><tr><td><img></td></table></foreignObject></g><g>quux</g></svg></annotation-xml></math>bar
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| "foo"
| <math math>
| <math mtext>
| <i>
| "baz"
| <math annotation-xml>
| <svg svg>
| <svg desc>
| <b>
| "eggs"
| <svg g>
| <svg foreignObject>
| <p>
| "spam"
| <table>
| <tbody>
| <tr>
| <td>
| <img>
| <svg g>
| "quux"
| "bar"
#data
<!DOCTYPE html><body>foo<math><mtext><i>baz</i></mtext><annotation-xml><svg><desc><b>eggs</b></desc><g><foreignObject><P>spam<TABLE><tr><td><img></td></table></foreignObject></g><g>quux</g></svg></annotation-xml></math>bar
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "foo"
| <math math>
| <math mtext>
| <i>
| "baz"
| <math annotation-xml>
| <svg svg>
| <svg desc>
| <b>
| "eggs"
| <svg g>
| <svg foreignObject>
| <p>
| "spam"
| <table>
| <tbody>
| <tr>
| <td>
| <img>
| <svg g>
| "quux"
| "bar"

View File

@@ -0,0 +1,74 @@
#data
<!DOCTYPE html><html><body><xyz:abc></xyz:abc>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <xyz:abc>
#data
<!DOCTYPE html><html><body><xyz:abc></xyz:abc><span></span>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <xyz:abc>
| <span>
#data
<!DOCTYPE html><html><html abc:def=gh><xyz:abc></xyz:abc>
#errors
15: Unexpected start tag html
#document
| <!DOCTYPE html>
| <html>
| abc:def="gh"
| <head>
| <body>
| <xyz:abc>
#data
<!DOCTYPE html><html xml:lang=bar><html xml:lang=foo>
#errors
15: Unexpected start tag html
#document
| <!DOCTYPE html>
| <html>
| xml:lang="bar"
| <head>
| <body>
#data
<!DOCTYPE html><html 123=456>
#errors
#document
| <!DOCTYPE html>
| <html>
| 123="456"
| <head>
| <body>
#data
<!DOCTYPE html><html 123=456><html 789=012>
#errors
#document
| <!DOCTYPE html>
| <html>
| 123="456"
| 789="012"
| <head>
| <body>
#data
<!DOCTYPE html><html><body 789=012>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| 789="012"

View File

@@ -0,0 +1,208 @@
#data
<!DOCTYPE html><p><b><i><u></p> <p>X
#errors
Line: 1 Col: 31 Unexpected end tag (p). Ignored.
Line: 1 Col: 36 Expected closing tag. Unexpected end of file.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <b>
| <i>
| <u>
| <b>
| <i>
| <u>
| " "
| <p>
| "X"
#data
<p><b><i><u></p>
<p>X
#errors
Line: 1 Col: 3 Unexpected start tag (p). Expected DOCTYPE.
Line: 1 Col: 16 Unexpected end tag (p). Ignored.
Line: 2 Col: 4 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <p>
| <b>
| <i>
| <u>
| <b>
| <i>
| <u>
| "
"
| <p>
| "X"
#data
<!doctype html></html> <head>
#errors
Line: 1 Col: 22 Unexpected end tag (html) after the (implied) root element.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| " "
#data
<!doctype html></body><meta>
#errors
Line: 1 Col: 22 Unexpected end tag (body) after the (implied) root element.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <meta>
#data
<html></html><!-- foo -->
#errors
Line: 1 Col: 6 Unexpected start tag (html). Expected DOCTYPE.
Line: 1 Col: 13 Unexpected end tag (html) after the (implied) root element.
#document
| <html>
| <head>
| <body>
| <!-- foo -->
#data
<!doctype html></body><title>X</title>
#errors
Line: 1 Col: 22 Unexpected end tag (body) after the (implied) root element.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <title>
| "X"
#data
<!doctype html><table> X<meta></table>
#errors
Line: 1 Col: 24 Unexpected non-space characters in table context caused voodoo mode.
Line: 1 Col: 30 Unexpected start tag (meta) in table context caused voodoo mode.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| " X"
| <meta>
| <table>
#data
<!doctype html><table> x</table>
#errors
Line: 1 Col: 24 Unexpected non-space characters in table context caused voodoo mode.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| " x"
| <table>
#data
<!doctype html><table> x </table>
#errors
Line: 1 Col: 25 Unexpected non-space characters in table context caused voodoo mode.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| " x "
| <table>
#data
<!doctype html><table><tr> x</table>
#errors
Line: 1 Col: 28 Unexpected non-space characters in table context caused voodoo mode.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| " x"
| <table>
| <tbody>
| <tr>
#data
<!doctype html><table>X<style> <tr>x </style> </table>
#errors
Line: 1 Col: 23 Unexpected non-space characters in table context caused voodoo mode.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "X"
| <table>
| <style>
| " <tr>x "
| " "
#data
<!doctype html><div><table><a>foo</a> <tr><td>bar</td> </tr></table></div>
#errors
Line: 1 Col: 30 Unexpected start tag (a) in table context caused voodoo mode.
Line: 1 Col: 37 Unexpected end tag (a) in table context caused voodoo mode.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <div>
| <a>
| "foo"
| <table>
| " "
| <tbody>
| <tr>
| <td>
| "bar"
| " "
#data
<frame></frame></frame><frameset><frame><frameset><frame></frameset><noframes></frameset><noframes>
#errors
6: Start tag seen without seeing a doctype first. Expected “<!DOCTYPE html>”.
13: Stray start tag “frame”.
21: Stray end tag “frame”.
29: Stray end tag “frame”.
39: “frameset” start tag after “body” already open.
105: End of file seen inside an [R]CDATA element.
105: End of file seen and there were open elements.
XXX: These errors are wrong, please fix me!
#document
| <html>
| <head>
| <frameset>
| <frame>
| <frameset>
| <frame>
| <noframes>
| "</frameset><noframes>"
#data
<!DOCTYPE html><object></html>
#errors
1: Expected closing tag. Unexpected end of file
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <object>

View File

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,153 @@
#data
<!doctype html><table><tbody><select><tr>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
| <table>
| <tbody>
| <tr>
#data
<!doctype html><table><tr><select><td>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
| <table>
| <tbody>
| <tr>
| <td>
#data
<!doctype html><table><tr><td><select><td>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <select>
| <td>
#data
<!doctype html><table><tr><th><select><td>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <th>
| <select>
| <td>
#data
<!doctype html><table><caption><select><tr>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <caption>
| <select>
| <tbody>
| <tr>
#data
<!doctype html><select><tr>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
#data
<!doctype html><select><td>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
#data
<!doctype html><select><th>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
#data
<!doctype html><select><tbody>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
#data
<!doctype html><select><thead>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
#data
<!doctype html><select><tfoot>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
#data
<!doctype html><select><caption>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
#data
<!doctype html><table><tr></table>a
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| "a"

View File

@@ -0,0 +1,269 @@
#data
<!doctype html><plaintext></plaintext>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <plaintext>
| "</plaintext>"
#data
<!doctype html><table><plaintext></plaintext>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <plaintext>
| "</plaintext>"
| <table>
#data
<!doctype html><table><tbody><plaintext></plaintext>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <plaintext>
| "</plaintext>"
| <table>
| <tbody>
#data
<!doctype html><table><tbody><tr><plaintext></plaintext>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <plaintext>
| "</plaintext>"
| <table>
| <tbody>
| <tr>
#data
<!doctype html><table><tbody><tr><plaintext></plaintext>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <plaintext>
| "</plaintext>"
| <table>
| <tbody>
| <tr>
#data
<!doctype html><table><td><plaintext></plaintext>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <plaintext>
| "</plaintext>"
#data
<!doctype html><table><caption><plaintext></plaintext>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <caption>
| <plaintext>
| "</plaintext>"
#data
<!doctype html><table><tr><style></script></style>abc
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "abc"
| <table>
| <tbody>
| <tr>
| <style>
| "</script>"
#data
<!doctype html><table><tr><script></style></script>abc
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "abc"
| <table>
| <tbody>
| <tr>
| <script>
| "</style>"
#data
<!doctype html><table><caption><style></script></style>abc
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <caption>
| <style>
| "</script>"
| "abc"
#data
<!doctype html><table><td><style></script></style>abc
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <style>
| "</script>"
| "abc"
#data
<!doctype html><select><script></style></script>abc
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
| <script>
| "</style>"
| "abc"
#data
<!doctype html><table><select><script></style></script>abc
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
| <script>
| "</style>"
| "abc"
| <table>
#data
<!doctype html><table><tr><select><script></style></script>abc
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
| <script>
| "</style>"
| "abc"
| <table>
| <tbody>
| <tr>
#data
<!doctype html><frameset></frameset><noframes>abc
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <frameset>
| <noframes>
| "abc"
#data
<!doctype html><frameset></frameset><noframes>abc</noframes><!--abc-->
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <frameset>
| <noframes>
| "abc"
| <!-- abc -->
#data
<!doctype html><frameset></frameset></html><noframes>abc
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <frameset>
| <noframes>
| "abc"
#data
<!doctype html><frameset></frameset></html><noframes>abc</noframes><!--abc-->
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <frameset>
| <noframes>
| "abc"
| <!-- abc -->
#data
<!doctype html><table><tr></tbody><tfoot>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <tfoot>
#data
<!doctype html><table><td><svg></svg>abc<td>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <svg svg>
| "abc"
| <td>

View File

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,763 @@
#data
<!DOCTYPE html>Test
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "Test"
#data
<textarea>test</div>test
#errors
Line: 1 Col: 10 Unexpected start tag (textarea). Expected DOCTYPE.
Line: 1 Col: 24 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <textarea>
| "test</div>test"
#data
<table><td>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 11 Unexpected table cell start tag (td) in the table body phase.
Line: 1 Col: 11 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
#data
<table><td>test</tbody></table>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 11 Unexpected table cell start tag (td) in the table body phase.
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| "test"
#data
<frame>test
#errors
Line: 1 Col: 7 Unexpected start tag (frame). Expected DOCTYPE.
Line: 1 Col: 7 Unexpected start tag frame. Ignored.
#document
| <html>
| <head>
| <body>
| "test"
#data
<!DOCTYPE html><frameset>test
#errors
Line: 1 Col: 29 Unepxected characters in the frameset phase. Characters ignored.
Line: 1 Col: 29 Expected closing tag. Unexpected end of file.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <frameset>
#data
<!DOCTYPE html><frameset><!DOCTYPE html>
#errors
Line: 1 Col: 40 Unexpected DOCTYPE. Ignored.
Line: 1 Col: 40 Expected closing tag. Unexpected end of file.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <frameset>
#data
<!DOCTYPE html><font><p><b>test</font>
#errors
Line: 1 Col: 38 End tag (font) violates step 1, paragraph 3 of the adoption agency algorithm.
Line: 1 Col: 38 End tag (font) violates step 1, paragraph 3 of the adoption agency algorithm.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <font>
| <p>
| <font>
| <b>
| "test"
#data
<!DOCTYPE html><dt><div><dd>
#errors
Line: 1 Col: 28 Missing end tag (div, dt).
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <dt>
| <div>
| <dd>
#data
<script></x
#errors
Line: 1 Col: 8 Unexpected start tag (script). Expected DOCTYPE.
Line: 1 Col: 11 Unexpected end of file. Expected end tag (script).
#document
| <html>
| <head>
| <script>
| "</x"
| <body>
#data
<table><plaintext><td>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 18 Unexpected start tag (plaintext) in table context caused voodoo mode.
Line: 1 Col: 22 Unexpected end of file. Expected table content.
#document
| <html>
| <head>
| <body>
| <plaintext>
| "<td>"
| <table>
#data
<plaintext></plaintext>
#errors
Line: 1 Col: 11 Unexpected start tag (plaintext). Expected DOCTYPE.
Line: 1 Col: 23 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <plaintext>
| "</plaintext>"
#data
<!DOCTYPE html><table><tr>TEST
#errors
Line: 1 Col: 30 Unexpected non-space characters in table context caused voodoo mode.
Line: 1 Col: 30 Unexpected end of file. Expected table content.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "TEST"
| <table>
| <tbody>
| <tr>
#data
<!DOCTYPE html><body t1=1><body t2=2><body t3=3 t4=4>
#errors
Line: 1 Col: 37 Unexpected start tag (body).
Line: 1 Col: 53 Unexpected start tag (body).
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| t1="1"
| t2="2"
| t3="3"
| t4="4"
#data
</b test
#errors
Line: 1 Col: 8 Unexpected end of file in attribute name.
Line: 1 Col: 8 End tag contains unexpected attributes.
Line: 1 Col: 8 Unexpected end tag (b). Expected DOCTYPE.
Line: 1 Col: 8 Unexpected end tag (b) after the (implied) root element.
#document
| <html>
| <head>
| <body>
#data
<!DOCTYPE html></b test<b &=&amp>X
#errors
Line: 1 Col: 32 Named entity didn't end with ';'.
Line: 1 Col: 33 End tag contains unexpected attributes.
Line: 1 Col: 33 Unexpected end tag (b) after the (implied) root element.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "X"
#data
<!doctypehtml><scrIPt type=text/x-foobar;baz>X</SCRipt
#errors
Line: 1 Col: 9 No space after literal string 'DOCTYPE'.
Line: 1 Col: 54 Unexpected end of file in the tag name.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <script>
| type="text/x-foobar;baz"
| "X</SCRipt"
| <body>
#data
&
#errors
Line: 1 Col: 1 Unexpected non-space characters. Expected DOCTYPE.
#document
| <html>
| <head>
| <body>
| "&"
#data
&#
#errors
Line: 1 Col: 1 Numeric entity expected. Got end of file instead.
Line: 1 Col: 1 Unexpected non-space characters. Expected DOCTYPE.
#document
| <html>
| <head>
| <body>
| "&#"
#data
&#X
#errors
Line: 1 Col: 3 Numeric entity expected but none found.
Line: 1 Col: 3 Unexpected non-space characters. Expected DOCTYPE.
#document
| <html>
| <head>
| <body>
| "&#X"
#data
&#x
#errors
Line: 1 Col: 3 Numeric entity expected but none found.
Line: 1 Col: 3 Unexpected non-space characters. Expected DOCTYPE.
#document
| <html>
| <head>
| <body>
| "&#x"
#data
&#45
#errors
Line: 1 Col: 4 Numeric entity didn't end with ';'.
Line: 1 Col: 4 Unexpected non-space characters. Expected DOCTYPE.
#document
| <html>
| <head>
| <body>
| "-"
#data
&x-test
#errors
Line: 1 Col: 1 Named entity expected. Got none.
Line: 1 Col: 1 Unexpected non-space characters. Expected DOCTYPE.
#document
| <html>
| <head>
| <body>
| "&x-test"
#data
<!doctypehtml><p><li>
#errors
Line: 1 Col: 9 No space after literal string 'DOCTYPE'.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <li>
#data
<!doctypehtml><p><dt>
#errors
Line: 1 Col: 9 No space after literal string 'DOCTYPE'.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <dt>
#data
<!doctypehtml><p><dd>
#errors
Line: 1 Col: 9 No space after literal string 'DOCTYPE'.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <dd>
#data
<!doctypehtml><p><form>
#errors
Line: 1 Col: 9 No space after literal string 'DOCTYPE'.
Line: 1 Col: 23 Expected closing tag. Unexpected end of file.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <form>
#data
<!DOCTYPE html><p></P>X
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| "X"
#data
&AMP
#errors
Line: 1 Col: 4 Named entity didn't end with ';'.
Line: 1 Col: 4 Unexpected non-space characters. Expected DOCTYPE.
#document
| <html>
| <head>
| <body>
| "&"
#data
&AMp;
#errors
Line: 1 Col: 1 Named entity expected. Got none.
Line: 1 Col: 1 Unexpected non-space characters. Expected DOCTYPE.
#document
| <html>
| <head>
| <body>
| "&AMp;"
#data
<!DOCTYPE html><html><head></head><body><thisISasillyTESTelementNameToMakeSureCrazyTagNamesArePARSEDcorrectLY>
#errors
Line: 1 Col: 110 Expected closing tag. Unexpected end of file.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <thisisasillytestelementnametomakesurecrazytagnamesareparsedcorrectly>
#data
<!DOCTYPE html>X</body>X
#errors
Line: 1 Col: 24 Unexpected non-space characters in the after body phase.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "XX"
#data
<!DOCTYPE html><!-- X
#errors
Line: 1 Col: 21 Unexpected end of file in comment.
#document
| <!DOCTYPE html>
| <!-- X -->
| <html>
| <head>
| <body>
#data
<!DOCTYPE html><table><caption>test TEST</caption><td>test
#errors
Line: 1 Col: 54 Unexpected table cell start tag (td) in the table body phase.
Line: 1 Col: 58 Expected closing tag. Unexpected end of file.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <caption>
| "test TEST"
| <tbody>
| <tr>
| <td>
| "test"
#data
<!DOCTYPE html><select><option><optgroup>
#errors
Line: 1 Col: 41 Expected closing tag. Unexpected end of file.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
| <option>
| <optgroup>
#data
<!DOCTYPE html><select><optgroup><option></optgroup><option><select><option>
#errors
Line: 1 Col: 68 Unexpected select start tag in the select phase treated as select end tag.
Line: 1 Col: 76 Expected closing tag. Unexpected end of file.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
| <optgroup>
| <option>
| <option>
| <option>
#data
<!DOCTYPE html><select><optgroup><option><optgroup>
#errors
Line: 1 Col: 51 Expected closing tag. Unexpected end of file.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
| <optgroup>
| <option>
| <optgroup>
#data
<!DOCTYPE html><datalist><option>foo</datalist>bar
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <datalist>
| <option>
| "foo"
| "bar"
#data
<!DOCTYPE html><font><input><input></font>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <font>
| <input>
| <input>
#data
<!DOCTYPE html><!-- XXX - XXX -->
#errors
#document
| <!DOCTYPE html>
| <!-- XXX - XXX -->
| <html>
| <head>
| <body>
#data
<!DOCTYPE html><!-- XXX - XXX
#errors
Line: 1 Col: 29 Unexpected end of file in comment (-)
#document
| <!DOCTYPE html>
| <!-- XXX - XXX -->
| <html>
| <head>
| <body>
#data
<!DOCTYPE html><!-- XXX - XXX - XXX -->
#errors
#document
| <!DOCTYPE html>
| <!-- XXX - XXX - XXX -->
| <html>
| <head>
| <body>
#data
<isindex test=x name=x>
#errors
Line: 1 Col: 23 Unexpected start tag (isindex). Expected DOCTYPE.
Line: 1 Col: 23 Unexpected start tag isindex. Don't use it!
#document
| <html>
| <head>
| <body>
| <form>
| <hr>
| <label>
| "This is a searchable index. Enter search keywords: "
| <input>
| name="isindex"
| test="x"
| <hr>
#data
test
test
#errors
Line: 2 Col: 4 Unexpected non-space characters. Expected DOCTYPE.
#document
| <html>
| <head>
| <body>
| "test
test"
#data
<!DOCTYPE html><body><title>test</body></title>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <title>
| "test</body>"
#data
<!DOCTYPE html><body><title>X</title><meta name=z><link rel=foo><style>
x { content:"</style" } </style>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <title>
| "X"
| <meta>
| name="z"
| <link>
| rel="foo"
| <style>
| "
x { content:"</style" } "
#data
<!DOCTYPE html><select><optgroup></optgroup></select>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
| <optgroup>
#data
#errors
Line: 2 Col: 1 Unexpected End of file. Expected DOCTYPE.
#document
| <html>
| <head>
| <body>
#data
<!DOCTYPE html> <html>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
#data
<!DOCTYPE html><script>
</script> <title>x</title> </head>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <script>
| "
"
| " "
| <title>
| "x"
| " "
| <body>
#data
<!DOCTYPE html><html><body><html id=x>
#errors
Line: 1 Col: 38 html needs to be the first start tag.
#document
| <!DOCTYPE html>
| <html>
| id="x"
| <head>
| <body>
#data
<!DOCTYPE html>X</body><html id="x">
#errors
Line: 1 Col: 36 Unexpected start tag token (html) in the after body phase.
Line: 1 Col: 36 html needs to be the first start tag.
#document
| <!DOCTYPE html>
| <html>
| id="x"
| <head>
| <body>
| "X"
#data
<!DOCTYPE html><head><html id=x>
#errors
Line: 1 Col: 32 html needs to be the first start tag.
#document
| <!DOCTYPE html>
| <html>
| id="x"
| <head>
| <body>
#data
<!DOCTYPE html>X</html>X
#errors
Line: 1 Col: 24 Unexpected non-space characters in the after body phase.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "XX"
#data
<!DOCTYPE html>X</html>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "X "
#data
<!DOCTYPE html>X</html><p>X
#errors
Line: 1 Col: 26 Unexpected start tag (p).
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "X"
| <p>
| "X"
#data
<!DOCTYPE html>X<p/x/y/z>
#errors
Line: 1 Col: 19 Expected a > after the /.
Line: 1 Col: 21 Solidus (/) incorrectly placed in tag.
Line: 1 Col: 23 Solidus (/) incorrectly placed in tag.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "X"
| <p>
| x=""
| y=""
| z=""
#data
<!DOCTYPE html><!--x--
#errors
Line: 1 Col: 22 Unexpected end of file in comment (--).
#document
| <!DOCTYPE html>
| <!-- x -->
| <html>
| <head>
| <body>
#data
<!DOCTYPE html><table><tr><td></p></table>
#errors
Line: 1 Col: 34 Unexpected end tag (p). Ignored.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <p>
#data
<!DOCTYPE <!DOCTYPE HTML>><!--<!--x-->-->
#errors
Line: 1 Col: 20 Expected space or '>'. Got ''
Line: 1 Col: 25 Erroneous DOCTYPE.
Line: 1 Col: 35 Unexpected character in comment found.
#document
| <!DOCTYPE <!doctype>
| <html>
| <head>
| <body>
| ">"
| <!-- <!--x -->
| "-->"
#data
<!doctype html><div><form></form><div></div></div>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <div>
| <form>
| <div>

View File

@@ -0,0 +1,455 @@
#data
<!doctype html><p><button><button>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <button>
#data
<!doctype html><p><button><address>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <address>
#data
<!doctype html><p><button><blockquote>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <blockquote>
#data
<!doctype html><p><button><menu>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <menu>
#data
<!doctype html><p><button><p>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <p>
#data
<!doctype html><p><button><ul>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <ul>
#data
<!doctype html><p><button><h1>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <h1>
#data
<!doctype html><p><button><h6>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <h6>
#data
<!doctype html><p><button><listing>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <listing>
#data
<!doctype html><p><button><pre>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <pre>
#data
<!doctype html><p><button><form>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <form>
#data
<!doctype html><p><button><li>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <li>
#data
<!doctype html><p><button><dd>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <dd>
#data
<!doctype html><p><button><dt>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <dt>
#data
<!doctype html><p><button><plaintext>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <plaintext>
#data
<!doctype html><p><button><table>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <table>
#data
<!doctype html><p><button><hr>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <hr>
#data
<!doctype html><p><button><xmp>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <xmp>
#data
<!doctype html><p><button></p>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <button>
| <p>
#data
<!doctype html><address><button></address>a
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <address>
| <button>
| "a"
#data
<!doctype html><address><button></address>a
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <address>
| <button>
| "a"
#data
<p><table></p>
#errors
#document
| <html>
| <head>
| <body>
| <p>
| <p>
| <table>
#data
<!doctype html><svg>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
#data
<!doctype html><p><figcaption>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <figcaption>
#data
<!doctype html><p><summary>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <summary>
#data
<!doctype html><form><table><form>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <form>
| <table>
#data
<!doctype html><table><form><form>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <form>
#data
<!doctype html><table><form></table><form>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <form>
#data
<!doctype html><svg><foreignObject><p>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <svg foreignObject>
| <p>
#data
<!doctype html><svg><title>abc
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <svg title>
| "abc"
#data
<option><span><option>
#errors
#document
| <html>
| <head>
| <body>
| <option>
| <span>
| <option>
#data
<option><option>
#errors
#document
| <html>
| <head>
| <body>
| <option>
| <option>
#data
<math><annotation-xml><div>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math annotation-xml>
| <div>
#data
<math><annotation-xml encoding="application/svg+xml"><div>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math annotation-xml>
| encoding="application/svg+xml"
| <div>
#data
<math><annotation-xml encoding="application/xhtml+xml"><div>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math annotation-xml>
| encoding="application/xhtml+xml"
| <div>
#data
<math><annotation-xml encoding="aPPlication/xhtmL+xMl"><div>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math annotation-xml>
| encoding="aPPlication/xhtmL+xMl"
| <div>
#data
<math><annotation-xml encoding="text/html"><div>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math annotation-xml>
| encoding="text/html"
| <div>
#data
<math><annotation-xml encoding="Text/htmL"><div>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math annotation-xml>
| encoding="Text/htmL"
| <div>
#data
<math><annotation-xml encoding=" text/html "><div>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math annotation-xml>
| encoding=" text/html "
| <div>

View File

@@ -0,0 +1,221 @@
#data
<svg><![CDATA[foo]]>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| "foo"
#data
<math><![CDATA[foo]]>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| "foo"
#data
<div><![CDATA[foo]]>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| <!-- [CDATA[foo]] -->
#data
<svg><![CDATA[foo
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| "foo"
#data
<svg><![CDATA[foo
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| "foo"
#data
<svg><![CDATA[
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
#data
<svg><![CDATA[]]>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
#data
<svg><![CDATA[]] >]]>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| "]] >"
#data
<svg><![CDATA[]] >]]>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| "]] >"
#data
<svg><![CDATA[]]
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| "]]"
#data
<svg><![CDATA[]
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| "]"
#data
<svg><![CDATA[]>a
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| "]>a"
#data
<svg><foreignObject><div><![CDATA[foo]]>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| <svg foreignObject>
| <div>
| <!-- [CDATA[foo]] -->
#data
<svg><![CDATA[<svg>]]>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| "<svg>"
#data
<svg><![CDATA[</svg>a]]>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| "</svg>a"
#data
<svg><![CDATA[<svg>a
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| "<svg>a"
#data
<svg><![CDATA[</svg>a
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| "</svg>a"
#data
<svg><![CDATA[<svg>]]><path>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| "<svg>"
| <svg path>
#data
<svg><![CDATA[<svg>]]></path>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| "<svg>"
#data
<svg><![CDATA[<svg>]]><!--path-->
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| "<svg>"
| <!-- path -->
#data
<svg><![CDATA[<svg>]]>path
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| "<svg>path"
#data
<svg><![CDATA[<!--svg-->]]>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| "<!--svg-->"

View File

@@ -0,0 +1,157 @@
#data
<a><b><big><em><strong><div>X</a>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| <b>
| <big>
| <em>
| <strong>
| <big>
| <em>
| <strong>
| <div>
| <a>
| "X"
#data
<a><b><div id=1><div id=2><div id=3><div id=4><div id=5><div id=6><div id=7><div id=8>A</a>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| <b>
| <b>
| <div>
| id="1"
| <a>
| <div>
| id="2"
| <a>
| <div>
| id="3"
| <a>
| <div>
| id="4"
| <a>
| <div>
| id="5"
| <a>
| <div>
| id="6"
| <a>
| <div>
| id="7"
| <a>
| <div>
| id="8"
| <a>
| "A"
#data
<a><b><div id=1><div id=2><div id=3><div id=4><div id=5><div id=6><div id=7><div id=8><div id=9>A</a>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| <b>
| <b>
| <div>
| id="1"
| <a>
| <div>
| id="2"
| <a>
| <div>
| id="3"
| <a>
| <div>
| id="4"
| <a>
| <div>
| id="5"
| <a>
| <div>
| id="6"
| <a>
| <div>
| id="7"
| <a>
| <div>
| id="8"
| <a>
| <div>
| id="9"
| "A"
#data
<a><b><div id=1><div id=2><div id=3><div id=4><div id=5><div id=6><div id=7><div id=8><div id=9><div id=10>A</a>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| <b>
| <b>
| <div>
| id="1"
| <a>
| <div>
| id="2"
| <a>
| <div>
| id="3"
| <a>
| <div>
| id="4"
| <a>
| <div>
| id="5"
| <a>
| <div>
| id="6"
| <a>
| <div>
| id="7"
| <a>
| <div>
| id="8"
| <a>
| <div>
| id="9"
| <div>
| id="10"
| "A"
#data
<cite><b><cite><i><cite><i><cite><i><div>X</b>TEST
#errors
Line: 1 Col: 6 Unexpected start tag (cite). Expected DOCTYPE.
Line: 1 Col: 46 End tag (b) violates step 1, paragraph 3 of the adoption agency algorithm.
Line: 1 Col: 50 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <cite>
| <b>
| <cite>
| <i>
| <cite>
| <i>
| <cite>
| <i>
| <i>
| <i>
| <div>
| <b>
| "X"
| "TEST"

View File

@@ -0,0 +1,155 @@
#data
<p><font size=4><font color=red><font size=4><font size=4><font size=4><font size=4><font size=4><font color=red><p>X
#errors
3: Start tag seen without seeing a doctype first. Expected “<!DOCTYPE html>”.
116: Unclosed elements.
117: End of file seen and there were open elements.
#document
| <html>
| <head>
| <body>
| <p>
| <font>
| size="4"
| <font>
| color="red"
| <font>
| size="4"
| <font>
| size="4"
| <font>
| size="4"
| <font>
| size="4"
| <font>
| size="4"
| <font>
| color="red"
| <p>
| <font>
| color="red"
| <font>
| size="4"
| <font>
| size="4"
| <font>
| size="4"
| <font>
| color="red"
| "X"
#data
<p><font size=4><font size=4><font size=4><font size=4><p>X
#errors
#document
| <html>
| <head>
| <body>
| <p>
| <font>
| size="4"
| <font>
| size="4"
| <font>
| size="4"
| <font>
| size="4"
| <p>
| <font>
| size="4"
| <font>
| size="4"
| <font>
| size="4"
| "X"
#data
<p><font size=4><font size=4><font size=4><font size="5"><font size=4><p>X
#errors
#document
| <html>
| <head>
| <body>
| <p>
| <font>
| size="4"
| <font>
| size="4"
| <font>
| size="4"
| <font>
| size="5"
| <font>
| size="4"
| <p>
| <font>
| size="4"
| <font>
| size="4"
| <font>
| size="5"
| <font>
| size="4"
| "X"
#data
<p><font size=4 id=a><font size=4 id=b><font size=4><font size=4><p>X
#errors
#document
| <html>
| <head>
| <body>
| <p>
| <font>
| id="a"
| size="4"
| <font>
| id="b"
| size="4"
| <font>
| size="4"
| <font>
| size="4"
| <p>
| <font>
| id="a"
| size="4"
| <font>
| id="b"
| size="4"
| <font>
| size="4"
| <font>
| size="4"
| "X"
#data
<p><b id=a><b id=a><b id=a><b><object><b id=a><b id=a>X</object><p>Y
#errors
#document
| <html>
| <head>
| <body>
| <p>
| <b>
| id="a"
| <b>
| id="a"
| <b>
| id="a"
| <b>
| <object>
| <b>
| id="a"
| <b>
| id="a"
| "X"
| <p>
| <b>
| id="a"
| <b>
| id="a"
| <b>
| id="a"
| <b>
| "Y"

View File

@@ -0,0 +1,79 @@
#data
<!DOCTYPE html>&NotEqualTilde;
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "≂̸"
#data
<!DOCTYPE html>&NotEqualTilde;A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "≂̸A"
#data
<!DOCTYPE html>&ThickSpace;
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| ""
#data
<!DOCTYPE html>&ThickSpace;A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "A"
#data
<!DOCTYPE html>&NotSubset;
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "⊂⃒"
#data
<!DOCTYPE html>&NotSubset;A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "⊂⃒A"
#data
<!DOCTYPE html>&Gopf;
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "𝔾"
#data
<!DOCTYPE html>&Gopf;A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "𝔾A"

View File

@@ -0,0 +1,219 @@
#data
<!DOCTYPE html><body><foo>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <foo>
| "A"
#data
<!DOCTYPE html><body><area>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <area>
| "A"
#data
<!DOCTYPE html><body><base>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <base>
| "A"
#data
<!DOCTYPE html><body><basefont>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <basefont>
| "A"
#data
<!DOCTYPE html><body><bgsound>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <bgsound>
| "A"
#data
<!DOCTYPE html><body><br>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <br>
| "A"
#data
<!DOCTYPE html><body><col>A
#errors
26: Stray start tag “col”.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "A"
#data
<!DOCTYPE html><body><command>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <command>
| "A"
#data
<!DOCTYPE html><body><embed>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <embed>
| "A"
#data
<!DOCTYPE html><body><frame>A
#errors
26: Stray start tag “frame”.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "A"
#data
<!DOCTYPE html><body><hr>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <hr>
| "A"
#data
<!DOCTYPE html><body><img>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <img>
| "A"
#data
<!DOCTYPE html><body><input>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <input>
| "A"
#data
<!DOCTYPE html><body><keygen>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <keygen>
| "A"
#data
<!DOCTYPE html><body><link>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <link>
| "A"
#data
<!DOCTYPE html><body><meta>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <meta>
| "A"
#data
<!DOCTYPE html><body><param>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <param>
| "A"
#data
<!DOCTYPE html><body><source>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <source>
| "A"
#data
<!DOCTYPE html><body><track>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <track>
| "A"
#data
<!DOCTYPE html><body><wbr>A
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <wbr>
| "A"

View File

@@ -0,0 +1,313 @@
#data
<!DOCTYPE html><body><a href='#1'><nobr>1<nobr></a><br><a href='#2'><nobr>2<nobr></a><br><a href='#3'><nobr>3<nobr></a>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <a>
| href="#1"
| <nobr>
| "1"
| <nobr>
| <nobr>
| <br>
| <a>
| href="#2"
| <a>
| href="#2"
| <nobr>
| "2"
| <nobr>
| <nobr>
| <br>
| <a>
| href="#3"
| <a>
| href="#3"
| <nobr>
| "3"
| <nobr>
#data
<!DOCTYPE html><body><b><nobr>1<nobr></b><i><nobr>2<nobr></i>3
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <b>
| <nobr>
| "1"
| <nobr>
| <nobr>
| <i>
| <i>
| <nobr>
| "2"
| <nobr>
| <nobr>
| "3"
#data
<!DOCTYPE html><body><b><nobr>1<table><nobr></b><i><nobr>2<nobr></i>3
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <b>
| <nobr>
| "1"
| <nobr>
| <i>
| <i>
| <nobr>
| "2"
| <nobr>
| <nobr>
| "3"
| <table>
#data
<!DOCTYPE html><body><b><nobr>1<table><tr><td><nobr></b><i><nobr>2<nobr></i>3
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <b>
| <nobr>
| "1"
| <table>
| <tbody>
| <tr>
| <td>
| <nobr>
| <i>
| <i>
| <nobr>
| "2"
| <nobr>
| <nobr>
| "3"
#data
<!DOCTYPE html><body><b><nobr>1<div><nobr></b><i><nobr>2<nobr></i>3
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <b>
| <nobr>
| "1"
| <div>
| <b>
| <nobr>
| <nobr>
| <nobr>
| <i>
| <i>
| <nobr>
| "2"
| <nobr>
| <nobr>
| "3"
#data
<!DOCTYPE html><body><b><nobr>1<nobr></b><div><i><nobr>2<nobr></i>3
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <b>
| <nobr>
| "1"
| <nobr>
| <div>
| <nobr>
| <i>
| <i>
| <nobr>
| "2"
| <nobr>
| <nobr>
| "3"
#data
<!DOCTYPE html><body><b><nobr>1<nobr><ins></b><i><nobr>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <b>
| <nobr>
| "1"
| <nobr>
| <ins>
| <nobr>
| <i>
| <i>
| <nobr>
#data
<!DOCTYPE html><body><b><nobr>1<ins><nobr></b><i>2
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <b>
| <nobr>
| "1"
| <ins>
| <nobr>
| <nobr>
| <i>
| "2"
#data
<!DOCTYPE html><body><b>1<nobr></b><i><nobr>2</i>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <b>
| "1"
| <nobr>
| <nobr>
| <i>
| <i>
| <nobr>
| "2"
#data
<p><code x</code></p>
#errors
#document
| <html>
| <head>
| <body>
| <p>
| <code>
| code=""
| x<=""
| <code>
| code=""
| x<=""
| "
"
#data
<!DOCTYPE html><svg><foreignObject><p><i></p>a
#errors
45: End tag “p” seen, but there were open elements.
41: Unclosed element “i”.
46: End of file seen and there were open elements.
35: Unclosed element “foreignObject”.
20: Unclosed element “svg”.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <svg svg>
| <svg foreignObject>
| <p>
| <i>
| <i>
| "a"
#data
<!DOCTYPE html><table><tr><td><svg><foreignObject><p><i></p>a
#errors
56: End tag “p” seen, but there were open elements.
52: Unclosed element “i”.
57: End of file seen and there were open elements.
46: Unclosed element “foreignObject”.
31: Unclosed element “svg”.
22: Unclosed element “table”.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <svg svg>
| <svg foreignObject>
| <p>
| <i>
| <i>
| "a"
#data
<!DOCTYPE html><math><mtext><p><i></p>a
#errors
38: End tag “p” seen, but there were open elements.
34: Unclosed element “i”.
39: End of file in a foreign namespace context.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <math math>
| <math mtext>
| <p>
| <i>
| <i>
| "a"
#data
<!DOCTYPE html><table><tr><td><math><mtext><p><i></p>a
#errors
53: End tag “p” seen, but there were open elements.
49: Unclosed element “i”.
54: End of file in a foreign namespace context.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <math math>
| <math mtext>
| <p>
| <i>
| <i>
| "a"
#data
<!DOCTYPE html><body><div><!/div>a
#errors
29: Bogus comment.
34: End of file seen and there were open elements.
26: Unclosed element “div”.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <div>
| <!-- /div -->
| "a"

View File

@@ -0,0 +1,305 @@
#data
<head></head><style></style>
#errors
Line: 1 Col: 6 Unexpected start tag (head). Expected DOCTYPE.
Line: 1 Col: 20 Unexpected start tag (style) that can be in head. Moved.
#document
| <html>
| <head>
| <style>
| <body>
#data
<head></head><script></script>
#errors
Line: 1 Col: 6 Unexpected start tag (head). Expected DOCTYPE.
Line: 1 Col: 21 Unexpected start tag (script) that can be in head. Moved.
#document
| <html>
| <head>
| <script>
| <body>
#data
<head></head><!-- --><style></style><!-- --><script></script>
#errors
Line: 1 Col: 6 Unexpected start tag (head). Expected DOCTYPE.
Line: 1 Col: 28 Unexpected start tag (style) that can be in head. Moved.
#document
| <html>
| <head>
| <style>
| <script>
| <!-- -->
| <!-- -->
| <body>
#data
<head></head><!-- -->x<style></style><!-- --><script></script>
#errors
Line: 1 Col: 6 Unexpected start tag (head). Expected DOCTYPE.
#document
| <html>
| <head>
| <!-- -->
| <body>
| "x"
| <style>
| <!-- -->
| <script>
#data
<!DOCTYPE html><html><head></head><body><pre>
</pre></body></html>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <pre>
#data
<!DOCTYPE html><html><head></head><body><pre>
foo</pre></body></html>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <pre>
| "foo"
#data
<!DOCTYPE html><html><head></head><body><pre>
foo</pre></body></html>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <pre>
| "
foo"
#data
<!DOCTYPE html><html><head></head><body><pre>
foo
</pre></body></html>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <pre>
| "foo
"
#data
<!DOCTYPE html><html><head></head><body><pre>x</pre><span>
</span></body></html>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <pre>
| "x"
| <span>
| "
"
#data
<!DOCTYPE html><html><head></head><body><pre>x
y</pre></body></html>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <pre>
| "x
y"
#data
<!DOCTYPE html><html><head></head><body><pre>x<div>
y</pre></body></html>
#errors
Line: 2 Col: 7 End tag (pre) seen too early. Expected other end tag.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <pre>
| "x"
| <div>
| "
y"
#data
<!DOCTYPE html><pre>&#x0a;&#x0a;A</pre>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <pre>
| "
A"
#data
<!DOCTYPE html><HTML><META><HEAD></HEAD></HTML>
#errors
Line: 1 Col: 33 Unexpected start tag head in existing head. Ignored.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <meta>
| <body>
#data
<!DOCTYPE html><HTML><HEAD><head></HEAD></HTML>
#errors
Line: 1 Col: 33 Unexpected start tag head in existing head. Ignored.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
#data
<textarea>foo<span>bar</span><i>baz
#errors
Line: 1 Col: 10 Unexpected start tag (textarea). Expected DOCTYPE.
Line: 1 Col: 35 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <textarea>
| "foo<span>bar</span><i>baz"
#data
<title>foo<span>bar</em><i>baz
#errors
Line: 1 Col: 7 Unexpected start tag (title). Expected DOCTYPE.
Line: 1 Col: 30 Unexpected end of file. Expected end tag (title).
#document
| <html>
| <head>
| <title>
| "foo<span>bar</em><i>baz"
| <body>
#data
<!DOCTYPE html><textarea>
</textarea>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <textarea>
#data
<!DOCTYPE html><textarea>
foo</textarea>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <textarea>
| "foo"
#data
<!DOCTYPE html><textarea>
foo</textarea>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <textarea>
| "
foo"
#data
<!DOCTYPE html><html><head></head><body><ul><li><div><p><li></ul></body></html>
#errors
Line: 1 Col: 60 Missing end tag (div, li).
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <ul>
| <li>
| <div>
| <p>
| <li>
#data
<!doctype html><nobr><nobr><nobr>
#errors
Line: 1 Col: 27 Unexpected start tag (nobr) implies end tag (nobr).
Line: 1 Col: 33 Unexpected start tag (nobr) implies end tag (nobr).
Line: 1 Col: 33 Expected closing tag. Unexpected end of file.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <nobr>
| <nobr>
| <nobr>
#data
<!doctype html><nobr><nobr></nobr><nobr>
#errors
Line: 1 Col: 27 Unexpected start tag (nobr) implies end tag (nobr).
Line: 1 Col: 40 Expected closing tag. Unexpected end of file.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <nobr>
| <nobr>
| <nobr>
#data
<!doctype html><html><body><p><table></table></body></html>
#errors
Not known
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <table>
#data
<p><table></table>
#errors
Not known
#document
| <html>
| <head>
| <body>
| <p>
| <table>

View File

@@ -0,0 +1,59 @@
#data
direct div content
#errors
#document-fragment
div
#document
| "direct div content"
#data
direct textarea content
#errors
#document-fragment
textarea
#document
| "direct textarea content"
#data
textarea content with <em>pseudo</em> <foo>markup
#errors
#document-fragment
textarea
#document
| "textarea content with <em>pseudo</em> <foo>markup"
#data
this is &#x0043;DATA inside a <style> element
#errors
#document-fragment
style
#document
| "this is &#x0043;DATA inside a <style> element"
#data
</plaintext>
#errors
#document-fragment
plaintext
#document
| "</plaintext>"
#data
setting html's innerHTML
#errors
Line: 1 Col: 24 Unexpected EOF in inner html mode.
#document-fragment
html
#document
| <head>
| <body>
| "setting html's innerHTML"
#data
<title>setting head's innerHTML</title>
#errors
#document-fragment
head
#document
| <title>
| "setting head's innerHTML"

View File

@@ -0,0 +1,191 @@
#data
<style> <!-- </style>x
#errors
Line: 1 Col: 7 Unexpected start tag (style). Expected DOCTYPE.
Line: 1 Col: 22 Unexpected end of file. Expected end tag (style).
#document
| <html>
| <head>
| <style>
| " <!-- "
| <body>
| "x"
#data
<style> <!-- </style> --> </style>x
#errors
Line: 1 Col: 7 Unexpected start tag (style). Expected DOCTYPE.
#document
| <html>
| <head>
| <style>
| " <!-- "
| " "
| <body>
| "--> x"
#data
<style> <!--> </style>x
#errors
Line: 1 Col: 7 Unexpected start tag (style). Expected DOCTYPE.
#document
| <html>
| <head>
| <style>
| " <!--> "
| <body>
| "x"
#data
<style> <!---> </style>x
#errors
Line: 1 Col: 7 Unexpected start tag (style). Expected DOCTYPE.
#document
| <html>
| <head>
| <style>
| " <!---> "
| <body>
| "x"
#data
<iframe> <!---> </iframe>x
#errors
Line: 1 Col: 8 Unexpected start tag (iframe). Expected DOCTYPE.
#document
| <html>
| <head>
| <body>
| <iframe>
| " <!---> "
| "x"
#data
<iframe> <!--- </iframe>->x</iframe> --> </iframe>x
#errors
Line: 1 Col: 8 Unexpected start tag (iframe). Expected DOCTYPE.
#document
| <html>
| <head>
| <body>
| <iframe>
| " <!--- "
| "->x --> x"
#data
<script> <!-- </script> --> </script>x
#errors
Line: 1 Col: 8 Unexpected start tag (script). Expected DOCTYPE.
#document
| <html>
| <head>
| <script>
| " <!-- "
| " "
| <body>
| "--> x"
#data
<title> <!-- </title> --> </title>x
#errors
Line: 1 Col: 7 Unexpected start tag (title). Expected DOCTYPE.
#document
| <html>
| <head>
| <title>
| " <!-- "
| " "
| <body>
| "--> x"
#data
<textarea> <!--- </textarea>->x</textarea> --> </textarea>x
#errors
Line: 1 Col: 10 Unexpected start tag (textarea). Expected DOCTYPE.
#document
| <html>
| <head>
| <body>
| <textarea>
| " <!--- "
| "->x --> x"
#data
<style> <!</-- </style>x
#errors
Line: 1 Col: 7 Unexpected start tag (style). Expected DOCTYPE.
#document
| <html>
| <head>
| <style>
| " <!</-- "
| <body>
| "x"
#data
<p><xmp></xmp>
#errors
XXX: Unknown
#document
| <html>
| <head>
| <body>
| <p>
| <xmp>
#data
<xmp> <!-- > --> </xmp>
#errors
Line: 1 Col: 5 Unexpected start tag (xmp). Expected DOCTYPE.
#document
| <html>
| <head>
| <body>
| <xmp>
| " <!-- > --> "
#data
<title>&amp;</title>
#errors
Line: 1 Col: 7 Unexpected start tag (title). Expected DOCTYPE.
#document
| <html>
| <head>
| <title>
| "&"
| <body>
#data
<title><!--&amp;--></title>
#errors
Line: 1 Col: 7 Unexpected start tag (title). Expected DOCTYPE.
#document
| <html>
| <head>
| <title>
| "<!--&-->"
| <body>
#data
<title><!--</title>
#errors
Line: 1 Col: 7 Unexpected start tag (title). Expected DOCTYPE.
Line: 1 Col: 19 Unexpected end of file. Expected end tag (title).
#document
| <html>
| <head>
| <title>
| "<!--"
| <body>
#data
<noscript><!--</noscript>--></noscript>
#errors
Line: 1 Col: 10 Unexpected start tag (noscript). Expected DOCTYPE.
#document
| <html>
| <head>
| <noscript>
| "<!--"
| <body>
| "-->"

View File

@@ -0,0 +1,663 @@
#data
<!doctype html></head> <head>
#errors
Line: 1 Col: 29 Unexpected start tag head. Ignored.
#document
| <!DOCTYPE html>
| <html>
| <head>
| " "
| <body>
#data
<!doctype html><form><div></form><div>
#errors
33: End tag "form" seen but there were unclosed elements.
38: End of file seen and there were open elements.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <form>
| <div>
| <div>
#data
<!doctype html><title>&amp;</title>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <title>
| "&"
| <body>
#data
<!doctype html><title><!--&amp;--></title>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <title>
| "<!--&-->"
| <body>
#data
<!doctype>
#errors
Line: 1 Col: 9 No space after literal string 'DOCTYPE'.
Line: 1 Col: 10 Unexpected > character. Expected DOCTYPE name.
Line: 1 Col: 10 Erroneous DOCTYPE.
#document
| <!DOCTYPE >
| <html>
| <head>
| <body>
#data
<!---x
#errors
Line: 1 Col: 6 Unexpected end of file in comment.
Line: 1 Col: 6 Unexpected End of file. Expected DOCTYPE.
#document
| <!-- -x -->
| <html>
| <head>
| <body>
#data
<body>
<div>
#errors
Line: 1 Col: 6 Unexpected start tag (body).
Line: 2 Col: 5 Expected closing tag. Unexpected end of file.
#document-fragment
div
#document
| "
"
| <div>
#data
<frameset></frameset>
foo
#errors
Line: 1 Col: 10 Unexpected start tag (frameset). Expected DOCTYPE.
Line: 2 Col: 3 Unexpected non-space characters in the after frameset phase. Ignored.
#document
| <html>
| <head>
| <frameset>
| "
"
#data
<frameset></frameset>
<noframes>
#errors
Line: 1 Col: 10 Unexpected start tag (frameset). Expected DOCTYPE.
Line: 2 Col: 10 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <frameset>
| "
"
| <noframes>
#data
<frameset></frameset>
<div>
#errors
Line: 1 Col: 10 Unexpected start tag (frameset). Expected DOCTYPE.
Line: 2 Col: 5 Unexpected start tag (div) in the after frameset phase. Ignored.
#document
| <html>
| <head>
| <frameset>
| "
"
#data
<frameset></frameset>
</html>
#errors
Line: 1 Col: 10 Unexpected start tag (frameset). Expected DOCTYPE.
#document
| <html>
| <head>
| <frameset>
| "
"
#data
<frameset></frameset>
</div>
#errors
Line: 1 Col: 10 Unexpected start tag (frameset). Expected DOCTYPE.
Line: 2 Col: 6 Unexpected end tag (div) in the after frameset phase. Ignored.
#document
| <html>
| <head>
| <frameset>
| "
"
#data
<form><form>
#errors
Line: 1 Col: 6 Unexpected start tag (form). Expected DOCTYPE.
Line: 1 Col: 12 Unexpected start tag (form).
Line: 1 Col: 12 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <form>
#data
<button><button>
#errors
Line: 1 Col: 8 Unexpected start tag (button). Expected DOCTYPE.
Line: 1 Col: 16 Unexpected start tag (button) implies end tag (button).
Line: 1 Col: 16 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <button>
| <button>
#data
<table><tr><td></th>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 20 Unexpected end tag (th). Ignored.
Line: 1 Col: 20 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
#data
<table><caption><td>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 20 Unexpected end tag (td). Ignored.
Line: 1 Col: 20 Unexpected table cell start tag (td) in the table body phase.
Line: 1 Col: 20 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <table>
| <caption>
| <tbody>
| <tr>
| <td>
#data
<table><caption><div>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 21 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <table>
| <caption>
| <div>
#data
</caption><div>
#errors
Line: 1 Col: 10 Unexpected end tag (caption). Ignored.
Line: 1 Col: 15 Expected closing tag. Unexpected end of file.
#document-fragment
caption
#document
| <div>
#data
<table><caption><div></caption>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 31 Unexpected end tag (caption). Missing end tag (div).
Line: 1 Col: 31 Unexpected end of file. Expected table content.
#document
| <html>
| <head>
| <body>
| <table>
| <caption>
| <div>
#data
<table><caption></table>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 24 Unexpected end table tag in caption. Generates implied end caption.
#document
| <html>
| <head>
| <body>
| <table>
| <caption>
#data
</table><div>
#errors
Line: 1 Col: 8 Unexpected end table tag in caption. Generates implied end caption.
Line: 1 Col: 8 Unexpected end tag (caption). Ignored.
Line: 1 Col: 13 Expected closing tag. Unexpected end of file.
#document-fragment
caption
#document
| <div>
#data
<table><caption></body></col></colgroup></html></tbody></td></tfoot></th></thead></tr>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 23 Unexpected end tag (body). Ignored.
Line: 1 Col: 29 Unexpected end tag (col). Ignored.
Line: 1 Col: 40 Unexpected end tag (colgroup). Ignored.
Line: 1 Col: 47 Unexpected end tag (html). Ignored.
Line: 1 Col: 55 Unexpected end tag (tbody). Ignored.
Line: 1 Col: 60 Unexpected end tag (td). Ignored.
Line: 1 Col: 68 Unexpected end tag (tfoot). Ignored.
Line: 1 Col: 73 Unexpected end tag (th). Ignored.
Line: 1 Col: 81 Unexpected end tag (thead). Ignored.
Line: 1 Col: 86 Unexpected end tag (tr). Ignored.
Line: 1 Col: 86 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <table>
| <caption>
#data
<table><caption><div></div>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 27 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <table>
| <caption>
| <div>
#data
<table><tr><td></body></caption></col></colgroup></html>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 22 Unexpected end tag (body). Ignored.
Line: 1 Col: 32 Unexpected end tag (caption). Ignored.
Line: 1 Col: 38 Unexpected end tag (col). Ignored.
Line: 1 Col: 49 Unexpected end tag (colgroup). Ignored.
Line: 1 Col: 56 Unexpected end tag (html). Ignored.
Line: 1 Col: 56 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
#data
</table></tbody></tfoot></thead></tr><div>
#errors
Line: 1 Col: 8 Unexpected end tag (table). Ignored.
Line: 1 Col: 16 Unexpected end tag (tbody). Ignored.
Line: 1 Col: 24 Unexpected end tag (tfoot). Ignored.
Line: 1 Col: 32 Unexpected end tag (thead). Ignored.
Line: 1 Col: 37 Unexpected end tag (tr). Ignored.
Line: 1 Col: 42 Expected closing tag. Unexpected end of file.
#document-fragment
td
#document
| <div>
#data
<table><colgroup>foo
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 20 Unexpected non-space characters in table context caused voodoo mode.
Line: 1 Col: 20 Unexpected end of file. Expected table content.
#document
| <html>
| <head>
| <body>
| "foo"
| <table>
| <colgroup>
#data
foo<col>
#errors
Line: 1 Col: 3 Unexpected end tag (colgroup). Ignored.
#document-fragment
colgroup
#document
| <col>
#data
<table><colgroup></col>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 23 This element (col) has no end tag.
Line: 1 Col: 23 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <table>
| <colgroup>
#data
<frameset><div>
#errors
Line: 1 Col: 10 Unexpected start tag (frameset). Expected DOCTYPE.
Line: 1 Col: 15 Unexpected start tag token (div) in the frameset phase. Ignored.
Line: 1 Col: 15 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <frameset>
#data
</frameset><frame>
#errors
Line: 1 Col: 11 Unexpected end tag token (frameset) in the frameset phase (innerHTML).
#document-fragment
frameset
#document
| <frame>
#data
<frameset></div>
#errors
Line: 1 Col: 10 Unexpected start tag (frameset). Expected DOCTYPE.
Line: 1 Col: 16 Unexpected end tag token (div) in the frameset phase. Ignored.
Line: 1 Col: 16 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <frameset>
#data
</body><div>
#errors
Line: 1 Col: 7 Unexpected end tag (body). Ignored.
Line: 1 Col: 12 Expected closing tag. Unexpected end of file.
#document-fragment
body
#document
| <div>
#data
<table><tr><div>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 16 Unexpected start tag (div) in table context caused voodoo mode.
Line: 1 Col: 16 Unexpected end of file. Expected table content.
#document
| <html>
| <head>
| <body>
| <div>
| <table>
| <tbody>
| <tr>
#data
</tr><td>
#errors
Line: 1 Col: 5 Unexpected end tag (tr). Ignored.
#document-fragment
tr
#document
| <td>
#data
</tbody></tfoot></thead><td>
#errors
Line: 1 Col: 8 Unexpected end tag (tbody). Ignored.
Line: 1 Col: 16 Unexpected end tag (tfoot). Ignored.
Line: 1 Col: 24 Unexpected end tag (thead). Ignored.
#document-fragment
tr
#document
| <td>
#data
<table><tr><div><td>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 16 Unexpected start tag (div) in table context caused voodoo mode.
Line: 1 Col: 20 Unexpected implied end tag (div) in the table row phase.
Line: 1 Col: 20 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <div>
| <table>
| <tbody>
| <tr>
| <td>
#data
<caption><col><colgroup><tbody><tfoot><thead><tr>
#errors
Line: 1 Col: 9 Unexpected start tag (caption).
Line: 1 Col: 14 Unexpected start tag (col).
Line: 1 Col: 24 Unexpected start tag (colgroup).
Line: 1 Col: 31 Unexpected start tag (tbody).
Line: 1 Col: 38 Unexpected start tag (tfoot).
Line: 1 Col: 45 Unexpected start tag (thead).
Line: 1 Col: 49 Unexpected end of file. Expected table content.
#document-fragment
tbody
#document
| <tr>
#data
<table><tbody></thead>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 22 Unexpected end tag (thead) in the table body phase. Ignored.
Line: 1 Col: 22 Unexpected end of file. Expected table content.
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
#data
</table><tr>
#errors
Line: 1 Col: 8 Unexpected end tag (table). Ignored.
Line: 1 Col: 12 Unexpected end of file. Expected table content.
#document-fragment
tbody
#document
| <tr>
#data
<table><tbody></body></caption></col></colgroup></html></td></th></tr>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 21 Unexpected end tag (body) in the table body phase. Ignored.
Line: 1 Col: 31 Unexpected end tag (caption) in the table body phase. Ignored.
Line: 1 Col: 37 Unexpected end tag (col) in the table body phase. Ignored.
Line: 1 Col: 48 Unexpected end tag (colgroup) in the table body phase. Ignored.
Line: 1 Col: 55 Unexpected end tag (html) in the table body phase. Ignored.
Line: 1 Col: 60 Unexpected end tag (td) in the table body phase. Ignored.
Line: 1 Col: 65 Unexpected end tag (th) in the table body phase. Ignored.
Line: 1 Col: 70 Unexpected end tag (tr) in the table body phase. Ignored.
Line: 1 Col: 70 Unexpected end of file. Expected table content.
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
#data
<table><tbody></div>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 20 Unexpected end tag (div) in table context caused voodoo mode.
Line: 1 Col: 20 End tag (div) seen too early. Expected other end tag.
Line: 1 Col: 20 Unexpected end of file. Expected table content.
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
#data
<table><table>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 14 Unexpected start tag (table) implies end tag (table).
Line: 1 Col: 14 Unexpected end of file. Expected table content.
#document
| <html>
| <head>
| <body>
| <table>
| <table>
#data
<table></body></caption></col></colgroup></html></tbody></td></tfoot></th></thead></tr>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 14 Unexpected end tag (body). Ignored.
Line: 1 Col: 24 Unexpected end tag (caption). Ignored.
Line: 1 Col: 30 Unexpected end tag (col). Ignored.
Line: 1 Col: 41 Unexpected end tag (colgroup). Ignored.
Line: 1 Col: 48 Unexpected end tag (html). Ignored.
Line: 1 Col: 56 Unexpected end tag (tbody). Ignored.
Line: 1 Col: 61 Unexpected end tag (td). Ignored.
Line: 1 Col: 69 Unexpected end tag (tfoot). Ignored.
Line: 1 Col: 74 Unexpected end tag (th). Ignored.
Line: 1 Col: 82 Unexpected end tag (thead). Ignored.
Line: 1 Col: 87 Unexpected end tag (tr). Ignored.
Line: 1 Col: 87 Unexpected end of file. Expected table content.
#document
| <html>
| <head>
| <body>
| <table>
#data
</table><tr>
#errors
Line: 1 Col: 8 Unexpected end tag (table). Ignored.
Line: 1 Col: 12 Unexpected end of file. Expected table content.
#document-fragment
table
#document
| <tbody>
| <tr>
#data
<body></body></html>
#errors
Line: 1 Col: 20 Unexpected html end tag in inner html mode.
Line: 1 Col: 20 Unexpected EOF in inner html mode.
#document-fragment
html
#document
| <head>
| <body>
#data
<html><frameset></frameset></html>
#errors
Line: 1 Col: 6 Unexpected start tag (html). Expected DOCTYPE.
#document
| <html>
| <head>
| <frameset>
| " "
#data
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"><html></html>
#errors
Line: 1 Col: 50 Erroneous DOCTYPE.
Line: 1 Col: 63 Unexpected end tag (html) after the (implied) root element.
#document
| <!DOCTYPE html "-//W3C//DTD HTML 4.01//EN" "">
| <html>
| <head>
| <body>
#data
<param><frameset></frameset>
#errors
Line: 1 Col: 7 Unexpected start tag (param). Expected DOCTYPE.
Line: 1 Col: 17 Unexpected start tag (frameset).
#document
| <html>
| <head>
| <frameset>
#data
<source><frameset></frameset>
#errors
Line: 1 Col: 7 Unexpected start tag (source). Expected DOCTYPE.
Line: 1 Col: 17 Unexpected start tag (frameset).
#document
| <html>
| <head>
| <frameset>
#data
<track><frameset></frameset>
#errors
Line: 1 Col: 7 Unexpected start tag (track). Expected DOCTYPE.
Line: 1 Col: 17 Unexpected start tag (frameset).
#document
| <html>
| <head>
| <frameset>
#data
</html><frameset></frameset>
#errors
7: End tag seen without seeing a doctype first. Expected “<!DOCTYPE html>”.
17: Stray “frameset” start tag.
17: “frameset” start tag seen.
#document
| <html>
| <head>
| <frameset>
#data
</body><frameset></frameset>
#errors
7: End tag seen without seeing a doctype first. Expected “<!DOCTYPE html>”.
17: Stray “frameset” start tag.
17: “frameset” start tag seen.
#document
| <html>
| <head>
| <frameset>

View File

@@ -0,0 +1,390 @@
#data
<!doctype html><body><title>X</title>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <title>
| "X"
#data
<!doctype html><table><title>X</title></table>
#errors
Line: 1 Col: 29 Unexpected start tag (title) in table context caused voodoo mode.
Line: 1 Col: 38 Unexpected end tag (title) in table context caused voodoo mode.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <title>
| "X"
| <table>
#data
<!doctype html><head></head><title>X</title>
#errors
Line: 1 Col: 35 Unexpected start tag (title) that can be in head. Moved.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <title>
| "X"
| <body>
#data
<!doctype html></head><title>X</title>
#errors
Line: 1 Col: 29 Unexpected start tag (title) that can be in head. Moved.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <title>
| "X"
| <body>
#data
<!doctype html><table><meta></table>
#errors
Line: 1 Col: 28 Unexpected start tag (meta) in table context caused voodoo mode.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <meta>
| <table>
#data
<!doctype html><table>X<tr><td><table> <meta></table></table>
#errors
Line: 1 Col: 23 Unexpected non-space characters in table context caused voodoo mode.
Line: 1 Col: 45 Unexpected start tag (meta) in table context caused voodoo mode.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "X"
| <table>
| <tbody>
| <tr>
| <td>
| <meta>
| <table>
| " "
#data
<!doctype html><html> <head>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
#data
<!doctype html> <head>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
#data
<!doctype html><table><style> <tr>x </style> </table>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <style>
| " <tr>x "
| " "
#data
<!doctype html><table><TBODY><script> <tr>x </script> </table>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <script>
| " <tr>x "
| " "
#data
<!doctype html><p><applet><p>X</p></applet>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <p>
| <applet>
| <p>
| "X"
#data
<!doctype html><listing>
X</listing>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <listing>
| "X"
#data
<!doctype html><select><input>X
#errors
Line: 1 Col: 30 Unexpected input start tag in the select phase.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
| <input>
| "X"
#data
<!doctype html><select><select>X
#errors
Line: 1 Col: 31 Unexpected select start tag in the select phase treated as select end tag.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
| "X"
#data
<!doctype html><table><input type=hidDEN></table>
#errors
Line: 1 Col: 41 Unexpected input with type hidden in table context.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <input>
| type="hidDEN"
#data
<!doctype html><table>X<input type=hidDEN></table>
#errors
Line: 1 Col: 23 Unexpected non-space characters in table context caused voodoo mode.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| "X"
| <table>
| <input>
| type="hidDEN"
#data
<!doctype html><table> <input type=hidDEN></table>
#errors
Line: 1 Col: 43 Unexpected input with type hidden in table context.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| " "
| <input>
| type="hidDEN"
#data
<!doctype html><table> <input type='hidDEN'></table>
#errors
Line: 1 Col: 45 Unexpected input with type hidden in table context.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| " "
| <input>
| type="hidDEN"
#data
<!doctype html><table><input type=" hidden"><input type=hidDEN></table>
#errors
Line: 1 Col: 44 Unexpected start tag (input) in table context caused voodoo mode.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <input>
| type=" hidden"
| <table>
| <input>
| type="hidDEN"
#data
<!doctype html><table><select>X<tr>
#errors
Line: 1 Col: 30 Unexpected start tag (select) in table context caused voodoo mode.
Line: 1 Col: 35 Unexpected table element start tag (trs) in the select in table phase.
Line: 1 Col: 35 Unexpected end of file. Expected table content.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
| "X"
| <table>
| <tbody>
| <tr>
#data
<!doctype html><select>X</select>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
| "X"
#data
<!DOCTYPE hTmL><html></html>
#errors
Line: 1 Col: 28 Unexpected end tag (html) after the (implied) root element.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
#data
<!DOCTYPE HTML><html></html>
#errors
Line: 1 Col: 28 Unexpected end tag (html) after the (implied) root element.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
#data
<body>X</body></body>
#errors
Line: 1 Col: 21 Unexpected end tag token (body) in the after body phase.
Line: 1 Col: 21 Unexpected EOF in inner html mode.
#document-fragment
html
#document
| <head>
| <body>
| "X"
#data
<div><p>a</x> b
#errors
Line: 1 Col: 5 Unexpected start tag (div). Expected DOCTYPE.
Line: 1 Col: 13 Unexpected end tag (x). Ignored.
Line: 1 Col: 15 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <div>
| <p>
| "a b"
#data
<table><tr><td><code></code> </table>
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <code>
| " "
#data
<table><b><tr><td>aaa</td></tr>bbb</table>ccc
#errors
XXX: Fix me
#document
| <html>
| <head>
| <body>
| <b>
| <b>
| "bbb"
| <table>
| <tbody>
| <tr>
| <td>
| "aaa"
| <b>
| "ccc"
#data
A<table><tr> B</tr> B</table>
#errors
XXX: Fix me
#document
| <html>
| <head>
| <body>
| "A B B"
| <table>
| <tbody>
| <tr>
#data
A<table><tr> B</tr> </em>C</table>
#errors
XXX: Fix me
#document
| <html>
| <head>
| <body>
| "A BC"
| <table>
| <tbody>
| <tr>
| " "
#data
<select><keygen>
#errors
Not known
#document
| <html>
| <head>
| <body>
| <select>
| <keygen>

View File

@@ -0,0 +1,148 @@
#data
<div>
<div></div>
</span>x
#errors
Line: 1 Col: 5 Unexpected start tag (div). Expected DOCTYPE.
Line: 3 Col: 7 Unexpected end tag (span). Ignored.
Line: 3 Col: 8 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <div>
| "
"
| <div>
| "
x"
#data
<div>x<div></div>
</span>x
#errors
Line: 1 Col: 5 Unexpected start tag (div). Expected DOCTYPE.
Line: 2 Col: 7 Unexpected end tag (span). Ignored.
Line: 2 Col: 8 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <div>
| "x"
| <div>
| "
x"
#data
<div>x<div></div>x</span>x
#errors
Line: 1 Col: 5 Unexpected start tag (div). Expected DOCTYPE.
Line: 1 Col: 25 Unexpected end tag (span). Ignored.
Line: 1 Col: 26 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <div>
| "x"
| <div>
| "xx"
#data
<div>x<div></div>y</span>z
#errors
Line: 1 Col: 5 Unexpected start tag (div). Expected DOCTYPE.
Line: 1 Col: 25 Unexpected end tag (span). Ignored.
Line: 1 Col: 26 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <div>
| "x"
| <div>
| "yz"
#data
<table><div>x<div></div>x</span>x
#errors
Line: 1 Col: 7 Unexpected start tag (table). Expected DOCTYPE.
Line: 1 Col: 12 Unexpected start tag (div) in table context caused voodoo mode.
Line: 1 Col: 18 Unexpected start tag (div) in table context caused voodoo mode.
Line: 1 Col: 24 Unexpected end tag (div) in table context caused voodoo mode.
Line: 1 Col: 32 Unexpected end tag (span) in table context caused voodoo mode.
Line: 1 Col: 32 Unexpected end tag (span). Ignored.
Line: 1 Col: 33 Unexpected end of file. Expected table content.
#document
| <html>
| <head>
| <body>
| <div>
| "x"
| <div>
| "xx"
| <table>
#data
x<table>x
#errors
Line: 1 Col: 1 Unexpected non-space characters. Expected DOCTYPE.
Line: 1 Col: 9 Unexpected non-space characters in table context caused voodoo mode.
Line: 1 Col: 9 Unexpected end of file. Expected table content.
#document
| <html>
| <head>
| <body>
| "xx"
| <table>
#data
x<table><table>x
#errors
Line: 1 Col: 1 Unexpected non-space characters. Expected DOCTYPE.
Line: 1 Col: 15 Unexpected start tag (table) implies end tag (table).
Line: 1 Col: 16 Unexpected non-space characters in table context caused voodoo mode.
Line: 1 Col: 16 Unexpected end of file. Expected table content.
#document
| <html>
| <head>
| <body>
| "x"
| <table>
| "x"
| <table>
#data
<b>a<div></div><div></b>y
#errors
Line: 1 Col: 3 Unexpected start tag (b). Expected DOCTYPE.
Line: 1 Col: 24 End tag (b) violates step 1, paragraph 3 of the adoption agency algorithm.
Line: 1 Col: 25 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <b>
| "a"
| <div>
| <div>
| <b>
| "y"
#data
<a><div><p></a>
#errors
Line: 1 Col: 3 Unexpected start tag (a). Expected DOCTYPE.
Line: 1 Col: 15 End tag (a) violates step 1, paragraph 3 of the adoption agency algorithm.
Line: 1 Col: 15 End tag (a) violates step 1, paragraph 3 of the adoption agency algorithm.
Line: 1 Col: 15 Expected closing tag. Unexpected end of file.
#document
| <html>
| <head>
| <body>
| <a>
| <div>
| <a>
| <p>
| <a>

View File

@@ -0,0 +1,457 @@
#data
<!DOCTYPE html><math></math>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <math math>
#data
<!DOCTYPE html><body><math></math>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <math math>
#data
<!DOCTYPE html><math><mi>
#errors
25: End of file in a foreign namespace context.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <math math>
| <math mi>
#data
<!DOCTYPE html><math><annotation-xml><svg><u>
#errors
45: HTML start tag “u” in a foreign namespace context.
45: End of file seen and there were open elements.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <math math>
| <math annotation-xml>
| <svg svg>
| <u>
#data
<!DOCTYPE html><body><select><math></math></select>
#errors
Line: 1 Col: 35 Unexpected start tag token (math) in the select phase. Ignored.
Line: 1 Col: 42 Unexpected end tag (math) in the select phase. Ignored.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
#data
<!DOCTYPE html><body><select><option><math></math></option></select>
#errors
Line: 1 Col: 43 Unexpected start tag token (math) in the select phase. Ignored.
Line: 1 Col: 50 Unexpected end tag (math) in the select phase. Ignored.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
| <option>
#data
<!DOCTYPE html><body><table><math></math></table>
#errors
Line: 1 Col: 34 Unexpected start tag (math) in table context caused voodoo mode.
Line: 1 Col: 41 Unexpected end tag (math) in table context caused voodoo mode.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <math math>
| <table>
#data
<!DOCTYPE html><body><table><math><mi>foo</mi></math></table>
#errors
Line: 1 Col: 34 Unexpected start tag (math) in table context caused voodoo mode.
Line: 1 Col: 46 Unexpected end tag (mi) in table context caused voodoo mode.
Line: 1 Col: 53 Unexpected end tag (math) in table context caused voodoo mode.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <math math>
| <math mi>
| "foo"
| <table>
#data
<!DOCTYPE html><body><table><math><mi>foo</mi><mi>bar</mi></math></table>
#errors
Line: 1 Col: 34 Unexpected start tag (math) in table context caused voodoo mode.
Line: 1 Col: 46 Unexpected end tag (mi) in table context caused voodoo mode.
Line: 1 Col: 58 Unexpected end tag (mi) in table context caused voodoo mode.
Line: 1 Col: 65 Unexpected end tag (math) in table context caused voodoo mode.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <math math>
| <math mi>
| "foo"
| <math mi>
| "bar"
| <table>
#data
<!DOCTYPE html><body><table><tbody><math><mi>foo</mi><mi>bar</mi></math></tbody></table>
#errors
Line: 1 Col: 41 Unexpected start tag (math) in table context caused voodoo mode.
Line: 1 Col: 53 Unexpected end tag (mi) in table context caused voodoo mode.
Line: 1 Col: 65 Unexpected end tag (mi) in table context caused voodoo mode.
Line: 1 Col: 72 Unexpected end tag (math) in table context caused voodoo mode.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <math math>
| <math mi>
| "foo"
| <math mi>
| "bar"
| <table>
| <tbody>
#data
<!DOCTYPE html><body><table><tbody><tr><math><mi>foo</mi><mi>bar</mi></math></tr></tbody></table>
#errors
Line: 1 Col: 45 Unexpected start tag (math) in table context caused voodoo mode.
Line: 1 Col: 57 Unexpected end tag (mi) in table context caused voodoo mode.
Line: 1 Col: 69 Unexpected end tag (mi) in table context caused voodoo mode.
Line: 1 Col: 76 Unexpected end tag (math) in table context caused voodoo mode.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <math math>
| <math mi>
| "foo"
| <math mi>
| "bar"
| <table>
| <tbody>
| <tr>
#data
<!DOCTYPE html><body><table><tbody><tr><td><math><mi>foo</mi><mi>bar</mi></math></td></tr></tbody></table>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <math math>
| <math mi>
| "foo"
| <math mi>
| "bar"
#data
<!DOCTYPE html><body><table><tbody><tr><td><math><mi>foo</mi><mi>bar</mi></math><p>baz</td></tr></tbody></table>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <math math>
| <math mi>
| "foo"
| <math mi>
| "bar"
| <p>
| "baz"
#data
<!DOCTYPE html><body><table><caption><math><mi>foo</mi><mi>bar</mi></math><p>baz</caption></table>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <caption>
| <math math>
| <math mi>
| "foo"
| <math mi>
| "bar"
| <p>
| "baz"
#data
<!DOCTYPE html><body><table><caption><math><mi>foo</mi><mi>bar</mi><p>baz</table><p>quux
#errors
Line: 1 Col: 70 HTML start tag "p" in a foreign namespace context.
Line: 1 Col: 81 Unexpected end table tag in caption. Generates implied end caption.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <caption>
| <math math>
| <math mi>
| "foo"
| <math mi>
| "bar"
| <p>
| "baz"
| <p>
| "quux"
#data
<!DOCTYPE html><body><table><caption><math><mi>foo</mi><mi>bar</mi>baz</table><p>quux
#errors
Line: 1 Col: 78 Unexpected end table tag in caption. Generates implied end caption.
Line: 1 Col: 78 Unexpected end tag (caption). Missing end tag (math).
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <caption>
| <math math>
| <math mi>
| "foo"
| <math mi>
| "bar"
| "baz"
| <p>
| "quux"
#data
<!DOCTYPE html><body><table><colgroup><math><mi>foo</mi><mi>bar</mi><p>baz</table><p>quux
#errors
Line: 1 Col: 44 Unexpected start tag (math) in table context caused voodoo mode.
Line: 1 Col: 56 Unexpected end tag (mi) in table context caused voodoo mode.
Line: 1 Col: 68 Unexpected end tag (mi) in table context caused voodoo mode.
Line: 1 Col: 71 HTML start tag "p" in a foreign namespace context.
Line: 1 Col: 71 Unexpected start tag (p) in table context caused voodoo mode.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <math math>
| <math mi>
| "foo"
| <math mi>
| "bar"
| <p>
| "baz"
| <table>
| <colgroup>
| <p>
| "quux"
#data
<!DOCTYPE html><body><table><tr><td><select><math><mi>foo</mi><mi>bar</mi><p>baz</table><p>quux
#errors
Line: 1 Col: 50 Unexpected start tag token (math) in the select phase. Ignored.
Line: 1 Col: 54 Unexpected start tag token (mi) in the select phase. Ignored.
Line: 1 Col: 62 Unexpected end tag (mi) in the select phase. Ignored.
Line: 1 Col: 66 Unexpected start tag token (mi) in the select phase. Ignored.
Line: 1 Col: 74 Unexpected end tag (mi) in the select phase. Ignored.
Line: 1 Col: 77 Unexpected start tag token (p) in the select phase. Ignored.
Line: 1 Col: 88 Unexpected table element end tag (tables) in the select in table phase.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <select>
| "foobarbaz"
| <p>
| "quux"
#data
<!DOCTYPE html><body><table><select><math><mi>foo</mi><mi>bar</mi><p>baz</table><p>quux
#errors
Line: 1 Col: 36 Unexpected start tag (select) in table context caused voodoo mode.
Line: 1 Col: 42 Unexpected start tag token (math) in the select phase. Ignored.
Line: 1 Col: 46 Unexpected start tag token (mi) in the select phase. Ignored.
Line: 1 Col: 54 Unexpected end tag (mi) in the select phase. Ignored.
Line: 1 Col: 58 Unexpected start tag token (mi) in the select phase. Ignored.
Line: 1 Col: 66 Unexpected end tag (mi) in the select phase. Ignored.
Line: 1 Col: 69 Unexpected start tag token (p) in the select phase. Ignored.
Line: 1 Col: 80 Unexpected table element end tag (tables) in the select in table phase.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <select>
| "foobarbaz"
| <table>
| <p>
| "quux"
#data
<!DOCTYPE html><body></body></html><math><mi>foo</mi><mi>bar</mi><p>baz
#errors
Line: 1 Col: 41 Unexpected start tag (math).
Line: 1 Col: 68 HTML start tag "p" in a foreign namespace context.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <math math>
| <math mi>
| "foo"
| <math mi>
| "bar"
| <p>
| "baz"
#data
<!DOCTYPE html><body></body><math><mi>foo</mi><mi>bar</mi><p>baz
#errors
Line: 1 Col: 34 Unexpected start tag token (math) in the after body phase.
Line: 1 Col: 61 HTML start tag "p" in a foreign namespace context.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <math math>
| <math mi>
| "foo"
| <math mi>
| "bar"
| <p>
| "baz"
#data
<!DOCTYPE html><frameset><math><mi></mi><mi></mi><p><span>
#errors
Line: 1 Col: 31 Unexpected start tag token (math) in the frameset phase. Ignored.
Line: 1 Col: 35 Unexpected start tag token (mi) in the frameset phase. Ignored.
Line: 1 Col: 40 Unexpected end tag token (mi) in the frameset phase. Ignored.
Line: 1 Col: 44 Unexpected start tag token (mi) in the frameset phase. Ignored.
Line: 1 Col: 49 Unexpected end tag token (mi) in the frameset phase. Ignored.
Line: 1 Col: 52 Unexpected start tag token (p) in the frameset phase. Ignored.
Line: 1 Col: 58 Unexpected start tag token (span) in the frameset phase. Ignored.
Line: 1 Col: 58 Expected closing tag. Unexpected end of file.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <frameset>
#data
<!DOCTYPE html><frameset></frameset><math><mi></mi><mi></mi><p><span>
#errors
Line: 1 Col: 42 Unexpected start tag (math) in the after frameset phase. Ignored.
Line: 1 Col: 46 Unexpected start tag (mi) in the after frameset phase. Ignored.
Line: 1 Col: 51 Unexpected end tag (mi) in the after frameset phase. Ignored.
Line: 1 Col: 55 Unexpected start tag (mi) in the after frameset phase. Ignored.
Line: 1 Col: 60 Unexpected end tag (mi) in the after frameset phase. Ignored.
Line: 1 Col: 63 Unexpected start tag (p) in the after frameset phase. Ignored.
Line: 1 Col: 69 Unexpected start tag (span) in the after frameset phase. Ignored.
#document
| <!DOCTYPE html>
| <html>
| <head>
| <frameset>
#data
<!DOCTYPE html><body xlink:href=foo><math xlink:href=foo></math>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| xlink:href="foo"
| <math math>
| xlink href="foo"
#data
<!DOCTYPE html><body xlink:href=foo xml:lang=en><math><mi xml:lang=en xlink:href=foo></mi></math>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| xlink:href="foo"
| xml:lang="en"
| <math math>
| <math mi>
| xlink href="foo"
| xml lang="en"
#data
<!DOCTYPE html><body xlink:href=foo xml:lang=en><math><mi xml:lang=en xlink:href=foo /></math>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| xlink:href="foo"
| xml:lang="en"
| <math math>
| <math mi>
| xlink href="foo"
| xml lang="en"
#data
<!DOCTYPE html><body xlink:href=foo xml:lang=en><math><mi xml:lang=en xlink:href=foo />bar</math>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| xlink:href="foo"
| xml:lang="en"
| <math math>
| <math mi>
| xlink href="foo"
| xml lang="en"
| "bar"

View File

@@ -0,0 +1,741 @@
#data
<body><span>
#errors
#document-fragment
body
#document
| <span>
#data
<span><body>
#errors
#document-fragment
body
#document
| <span>
#data
<span><body>
#errors
#document-fragment
div
#document
| <span>
#data
<body><span>
#errors
#document-fragment
html
#document
| <head>
| <body>
| <span>
#data
<frameset><span>
#errors
#document-fragment
body
#document
| <span>
#data
<span><frameset>
#errors
#document-fragment
body
#document
| <span>
#data
<span><frameset>
#errors
#document-fragment
div
#document
| <span>
#data
<frameset><span>
#errors
#document-fragment
html
#document
| <head>
| <frameset>
#data
<table><tr>
#errors
#document-fragment
table
#document
| <tbody>
| <tr>
#data
</table><tr>
#errors
#document-fragment
table
#document
| <tbody>
| <tr>
#data
<a>
#errors
#document-fragment
table
#document
| <a>
#data
<a>
#errors
#document-fragment
table
#document
| <a>
#data
<a><caption>a
#errors
#document-fragment
table
#document
| <a>
| <caption>
| "a"
#data
<a><colgroup><col>
#errors
#document-fragment
table
#document
| <a>
| <colgroup>
| <col>
#data
<a><tbody><tr>
#errors
#document-fragment
table
#document
| <a>
| <tbody>
| <tr>
#data
<a><tfoot><tr>
#errors
#document-fragment
table
#document
| <a>
| <tfoot>
| <tr>
#data
<a><thead><tr>
#errors
#document-fragment
table
#document
| <a>
| <thead>
| <tr>
#data
<a><tr>
#errors
#document-fragment
table
#document
| <a>
| <tbody>
| <tr>
#data
<a><th>
#errors
#document-fragment
table
#document
| <a>
| <tbody>
| <tr>
| <th>
#data
<a><td>
#errors
#document-fragment
table
#document
| <a>
| <tbody>
| <tr>
| <td>
#data
<table></table><tbody>
#errors
#document-fragment
caption
#document
| <table>
#data
</table><span>
#errors
#document-fragment
caption
#document
| <span>
#data
<span></table>
#errors
#document-fragment
caption
#document
| <span>
#data
</caption><span>
#errors
#document-fragment
caption
#document
| <span>
#data
<span></caption><span>
#errors
#document-fragment
caption
#document
| <span>
| <span>
#data
<span><caption><span>
#errors
#document-fragment
caption
#document
| <span>
| <span>
#data
<span><col><span>
#errors
#document-fragment
caption
#document
| <span>
| <span>
#data
<span><colgroup><span>
#errors
#document-fragment
caption
#document
| <span>
| <span>
#data
<span><html><span>
#errors
#document-fragment
caption
#document
| <span>
| <span>
#data
<span><tbody><span>
#errors
#document-fragment
caption
#document
| <span>
| <span>
#data
<span><td><span>
#errors
#document-fragment
caption
#document
| <span>
| <span>
#data
<span><tfoot><span>
#errors
#document-fragment
caption
#document
| <span>
| <span>
#data
<span><thead><span>
#errors
#document-fragment
caption
#document
| <span>
| <span>
#data
<span><th><span>
#errors
#document-fragment
caption
#document
| <span>
| <span>
#data
<span><tr><span>
#errors
#document-fragment
caption
#document
| <span>
| <span>
#data
<span></table><span>
#errors
#document-fragment
caption
#document
| <span>
| <span>
#data
</colgroup><col>
#errors
#document-fragment
colgroup
#document
| <col>
#data
<a><col>
#errors
#document-fragment
colgroup
#document
| <col>
#data
<caption><a>
#errors
#document-fragment
tbody
#document
| <a>
#data
<col><a>
#errors
#document-fragment
tbody
#document
| <a>
#data
<colgroup><a>
#errors
#document-fragment
tbody
#document
| <a>
#data
<tbody><a>
#errors
#document-fragment
tbody
#document
| <a>
#data
<tfoot><a>
#errors
#document-fragment
tbody
#document
| <a>
#data
<thead><a>
#errors
#document-fragment
tbody
#document
| <a>
#data
</table><a>
#errors
#document-fragment
tbody
#document
| <a>
#data
<a><tr>
#errors
#document-fragment
tbody
#document
| <a>
| <tr>
#data
<a><td>
#errors
#document-fragment
tbody
#document
| <a>
| <tr>
| <td>
#data
<a><td>
#errors
#document-fragment
tbody
#document
| <a>
| <tr>
| <td>
#data
<a><td>
#errors
#document-fragment
tbody
#document
| <a>
| <tr>
| <td>
#data
<td><table><tbody><a><tr>
#errors
#document-fragment
tbody
#document
| <tr>
| <td>
| <a>
| <table>
| <tbody>
| <tr>
#data
</tr><td>
#errors
#document-fragment
tr
#document
| <td>
#data
<td><table><a><tr></tr><tr>
#errors
#document-fragment
tr
#document
| <td>
| <a>
| <table>
| <tbody>
| <tr>
| <tr>
#data
<caption><td>
#errors
#document-fragment
tr
#document
| <td>
#data
<col><td>
#errors
#document-fragment
tr
#document
| <td>
#data
<colgroup><td>
#errors
#document-fragment
tr
#document
| <td>
#data
<tbody><td>
#errors
#document-fragment
tr
#document
| <td>
#data
<tfoot><td>
#errors
#document-fragment
tr
#document
| <td>
#data
<thead><td>
#errors
#document-fragment
tr
#document
| <td>
#data
<tr><td>
#errors
#document-fragment
tr
#document
| <td>
#data
</table><td>
#errors
#document-fragment
tr
#document
| <td>
#data
<td><table></table><td>
#errors
#document-fragment
tr
#document
| <td>
| <table>
| <td>
#data
<td><table></table><td>
#errors
#document-fragment
tr
#document
| <td>
| <table>
| <td>
#data
<caption><a>
#errors
#document-fragment
td
#document
| <a>
#data
<col><a>
#errors
#document-fragment
td
#document
| <a>
#data
<colgroup><a>
#errors
#document-fragment
td
#document
| <a>
#data
<tbody><a>
#errors
#document-fragment
td
#document
| <a>
#data
<tfoot><a>
#errors
#document-fragment
td
#document
| <a>
#data
<th><a>
#errors
#document-fragment
td
#document
| <a>
#data
<thead><a>
#errors
#document-fragment
td
#document
| <a>
#data
<tr><a>
#errors
#document-fragment
td
#document
| <a>
#data
</table><a>
#errors
#document-fragment
td
#document
| <a>
#data
</tbody><a>
#errors
#document-fragment
td
#document
| <a>
#data
</td><a>
#errors
#document-fragment
td
#document
| <a>
#data
</tfoot><a>
#errors
#document-fragment
td
#document
| <a>
#data
</thead><a>
#errors
#document-fragment
td
#document
| <a>
#data
</th><a>
#errors
#document-fragment
td
#document
| <a>
#data
</tr><a>
#errors
#document-fragment
td
#document
| <a>
#data
<table><td><td>
#errors
#document-fragment
td
#document
| <table>
| <tbody>
| <tr>
| <td>
| <td>
#data
</select><option>
#errors
#document-fragment
select
#document
| <option>
#data
<input><option>
#errors
#document-fragment
select
#document
| <option>
#data
<keygen><option>
#errors
#document-fragment
select
#document
| <option>
#data
<textarea><option>
#errors
#document-fragment
select
#document
| <option>
#data
</html><!--abc-->
#errors
#document-fragment
html
#document
| <head>
| <body>
| <!-- abc -->
#data
</frameset><frame>
#errors
#document-fragment
frameset
#document
| <frame>
#data
#errors
#document-fragment
html
#document
| <head>
| <body>

View File

@@ -0,0 +1,261 @@
#data
<b><p>Bold </b> Not bold</p>
Also not bold.
#errors
#document
| <html>
| <head>
| <body>
| <b>
| <p>
| <b>
| "Bold "
| " Not bold"
| "
Also not bold."
#data
<html>
<font color=red><i>Italic and Red<p>Italic and Red </font> Just italic.</p> Italic only.</i> Plain
<p>I should not be red. <font color=red>Red. <i>Italic and red.</p>
<p>Italic and red. </i> Red.</font> I should not be red.</p>
<b>Bold <i>Bold and italic</b> Only Italic </i> Plain
#errors
#document
| <html>
| <head>
| <body>
| <font>
| color="red"
| <i>
| "Italic and Red"
| <i>
| <p>
| <font>
| color="red"
| "Italic and Red "
| " Just italic."
| " Italic only."
| " Plain
"
| <p>
| "I should not be red. "
| <font>
| color="red"
| "Red. "
| <i>
| "Italic and red."
| <font>
| color="red"
| <i>
| "
"
| <p>
| <font>
| color="red"
| <i>
| "Italic and red. "
| " Red."
| " I should not be red."
| "
"
| <b>
| "Bold "
| <i>
| "Bold and italic"
| <i>
| " Only Italic "
| " Plain"
#data
<html><body>
<p><font size="7">First paragraph.</p>
<p>Second paragraph.</p></font>
<b><p><i>Bold and Italic</b> Italic</p>
#errors
#document
| <html>
| <head>
| <body>
| "
"
| <p>
| <font>
| size="7"
| "First paragraph."
| <font>
| size="7"
| "
"
| <p>
| "Second paragraph."
| "
"
| <b>
| <p>
| <b>
| <i>
| "Bold and Italic"
| <i>
| " Italic"
#data
<html>
<dl>
<dt><b>Boo
<dd>Goo?
</dl>
</html>
#errors
#document
| <html>
| <head>
| <body>
| <dl>
| "
"
| <dt>
| <b>
| "Boo
"
| <dd>
| <b>
| "Goo?
"
| <b>
| "
"
#data
<html><body>
<label><a><div>Hello<div>World</div></a></label>
</body></html>
#errors
#document
| <html>
| <head>
| <body>
| "
"
| <label>
| <a>
| <div>
| <a>
| "Hello"
| <div>
| "World"
| "
"
#data
<table><center> <font>a</center> <img> <tr><td> </td> </tr> </table>
#errors
#document
| <html>
| <head>
| <body>
| <center>
| " "
| <font>
| "a"
| <font>
| <img>
| " "
| <table>
| " "
| <tbody>
| <tr>
| <td>
| " "
| " "
| " "
#data
<table><tr><p><a><p>You should see this text.
#errors
#document
| <html>
| <head>
| <body>
| <p>
| <a>
| <p>
| <a>
| "You should see this text."
| <table>
| <tbody>
| <tr>
#data
<TABLE>
<TR>
<CENTER><CENTER><TD></TD></TR><TR>
<FONT>
<TABLE><tr></tr></TABLE>
</P>
<a></font><font></a>
This page contains an insanely badly-nested tag sequence.
#errors
#document
| <html>
| <head>
| <body>
| <center>
| <center>
| <font>
| "
"
| <table>
| "
"
| <tbody>
| <tr>
| "
"
| <td>
| <tr>
| "
"
| <table>
| <tbody>
| <tr>
| <font>
| "
"
| <p>
| "
"
| <a>
| <a>
| <font>
| <font>
| "
This page contains an insanely badly-nested tag sequence."
#data
<html>
<body>
<b><nobr><div>This text is in a div inside a nobr</nobr>More text that should not be in the nobr, i.e., the
nobr should have closed the div inside it implicitly. </b><pre>A pre tag outside everything else.</pre>
</body>
</html>
#errors
#document
| <html>
| <head>
| <body>
| "
"
| <b>
| <nobr>
| <div>
| <b>
| <nobr>
| "This text is in a div inside a nobr"
| "More text that should not be in the nobr, i.e., the
nobr should have closed the div inside it implicitly. "
| <pre>
| "A pre tag outside everything else."
| "
"

View File

@@ -0,0 +1,610 @@
#data
Test
#errors
Line: 1 Col: 4 Unexpected non-space characters. Expected DOCTYPE.
#document
| <html>
| <head>
| <body>
| "Test"
#data
<div></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
#data
<div>Test</div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| "Test"
#data
<di
#errors
#document
| <html>
| <head>
| <body>
#data
<div>Hello</div>
<script>
console.log("PASS");
</script>
<div>Bye</div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| "Hello"
| "
"
| <script>
| "
console.log("PASS");
"
| "
"
| <div>
| "Bye"
#data
<div foo="bar">Hello</div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| foo="bar"
| "Hello"
#data
<div>Hello</div>
<script>
console.log("FOO<span>BAR</span>BAZ");
</script>
<div>Bye</div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| "Hello"
| "
"
| <script>
| "
console.log("FOO<span>BAR</span>BAZ");
"
| "
"
| <div>
| "Bye"
#data
<foo bar="baz"></foo><potato quack="duck"></potato>
#errors
#document
| <html>
| <head>
| <body>
| <foo>
| bar="baz"
| <potato>
| quack="duck"
#data
<foo bar="baz"><potato quack="duck"></potato></foo>
#errors
#document
| <html>
| <head>
| <body>
| <foo>
| bar="baz"
| <potato>
| quack="duck"
#data
<foo></foo bar="baz"><potato></potato quack="duck">
#errors
#document
| <html>
| <head>
| <body>
| <foo>
| <potato>
#data
</ tttt>
#errors
#document
| <!-- tttt -->
| <html>
| <head>
| <body>
#data
<div FOO ><img><img></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| foo=""
| <img>
| <img>
#data
<p>Test</p<p>Test2</p>
#errors
#document
| <html>
| <head>
| <body>
| <p>
| "TestTest2"
#data
<rdar://problem/6869687>
#errors
#document
| <html>
| <head>
| <body>
| <rdar:>
| 6869687=""
| problem=""
#data
<A>test< /A>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| "test< /A>"
#data
&lt;
#errors
#document
| <html>
| <head>
| <body>
| "<"
#data
<body foo='bar'><body foo='baz' yo='mama'>
#errors
#document
| <html>
| <head>
| <body>
| foo="bar"
| yo="mama"
#data
<body></br foo="bar"></body>
#errors
#document
| <html>
| <head>
| <body>
| <br>
#data
<bdy><br foo="bar"></body>
#errors
#document
| <html>
| <head>
| <body>
| <bdy>
| <br>
| foo="bar"
#data
<body></body></br foo="bar">
#errors
#document
| <html>
| <head>
| <body>
| <br>
#data
<bdy></body><br foo="bar">
#errors
#document
| <html>
| <head>
| <body>
| <bdy>
| <br>
| foo="bar"
#data
<html><body></body></html><!-- Hi there -->
#errors
#document
| <html>
| <head>
| <body>
| <!-- Hi there -->
#data
<html><body></body></html>x<!-- Hi there -->
#errors
#document
| <html>
| <head>
| <body>
| "x"
| <!-- Hi there -->
#data
<html><body></body></html>x<!-- Hi there --></html><!-- Again -->
#errors
#document
| <html>
| <head>
| <body>
| "x"
| <!-- Hi there -->
| <!-- Again -->
#data
<html><body></body></html>x<!-- Hi there --></body></html><!-- Again -->
#errors
#document
| <html>
| <head>
| <body>
| "x"
| <!-- Hi there -->
| <!-- Again -->
#data
<html><body><ruby><div><rp>xx</rp></div></ruby></body></html>
#errors
#document
| <html>
| <head>
| <body>
| <ruby>
| <div>
| <rp>
| "xx"
#data
<html><body><ruby><div><rt>xx</rt></div></ruby></body></html>
#errors
#document
| <html>
| <head>
| <body>
| <ruby>
| <div>
| <rt>
| "xx"
#data
<html><frameset><!--1--><noframes>A</noframes><!--2--></frameset><!--3--><noframes>B</noframes><!--4--></html><!--5--><noframes>C</noframes><!--6-->
#errors
#document
| <html>
| <head>
| <frameset>
| <!-- 1 -->
| <noframes>
| "A"
| <!-- 2 -->
| <!-- 3 -->
| <noframes>
| "B"
| <!-- 4 -->
| <noframes>
| "C"
| <!-- 5 -->
| <!-- 6 -->
#data
<select><option>A<select><option>B<select><option>C<select><option>D<select><option>E<select><option>F<select><option>G<select>
#errors
#document
| <html>
| <head>
| <body>
| <select>
| <option>
| "A"
| <option>
| "B"
| <select>
| <option>
| "C"
| <option>
| "D"
| <select>
| <option>
| "E"
| <option>
| "F"
| <select>
| <option>
| "G"
#data
<dd><dd><dt><dt><dd><li><li>
#errors
#document
| <html>
| <head>
| <body>
| <dd>
| <dd>
| <dt>
| <dt>
| <dd>
| <li>
| <li>
#data
<div><b></div><div><nobr>a<nobr>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| <b>
| <div>
| <b>
| <nobr>
| "a"
| <nobr>
#data
<head></head>
<body></body>
#errors
#document
| <html>
| <head>
| "
"
| <body>
#data
<head></head> <style></style>ddd
#errors
#document
| <html>
| <head>
| <style>
| " "
| <body>
| "ddd"
#data
<kbd><table></kbd><col><select><tr>
#errors
#document
| <html>
| <head>
| <body>
| <kbd>
| <select>
| <table>
| <colgroup>
| <col>
| <tbody>
| <tr>
#data
<kbd><table></kbd><col><select><tr></table><div>
#errors
#document
| <html>
| <head>
| <body>
| <kbd>
| <select>
| <table>
| <colgroup>
| <col>
| <tbody>
| <tr>
| <div>
#data
<a><li><style></style><title></title></a>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| <li>
| <a>
| <style>
| <title>
#data
<font></p><p><meta><title></title></font>
#errors
#document
| <html>
| <head>
| <body>
| <font>
| <p>
| <p>
| <font>
| <meta>
| <title>
#data
<a><center><title></title><a>
#errors
#document
| <html>
| <head>
| <body>
| <a>
| <center>
| <a>
| <title>
| <a>
#data
<svg><title><div>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| <svg title>
| <div>
#data
<svg><title><rect><div>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| <svg title>
| <rect>
| <div>
#data
<svg><title><svg><div>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| <svg title>
| <svg svg>
| <div>
#data
<img <="" FAIL>
#errors
#document
| <html>
| <head>
| <body>
| <img>
| <=""
| fail=""
#data
<ul><li><div id='foo'/>A</li><li>B<div>C</div></li></ul>
#errors
#document
| <html>
| <head>
| <body>
| <ul>
| <li>
| <div>
| id="foo"
| "A"
| <li>
| "B"
| <div>
| "C"
#data
<svg><em><desc></em>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| <em>
| <desc>
#data
<table><tr><td><svg><desc><td></desc><circle>
#errors
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| <svg svg>
| <svg desc>
| <td>
| <circle>
#data
<svg><tfoot></mi><td>
#errors
#document
| <html>
| <head>
| <body>
| <svg svg>
| <svg tfoot>
| <svg td>
#data
<math><mrow><mrow><mn>1</mn></mrow><mi>a</mi></mrow></math>
#errors
#document
| <html>
| <head>
| <body>
| <math math>
| <math mrow>
| <math mrow>
| <math mn>
| "1"
| <math mi>
| "a"
#data
<!doctype html><input type="hidden"><frameset>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <frameset>
#data
<!doctype html><input type="button"><frameset>
#errors
#document
| <!DOCTYPE html>
| <html>
| <head>
| <body>
| <input>
| type="button"

View File

@@ -0,0 +1,159 @@
#data
<foo bar=qux/>
#errors
#document
| <html>
| <head>
| <body>
| <foo>
| bar="qux/"
#data
<p id="status"><noscript><strong>A</strong></noscript><span>B</span></p>
#errors
#document
| <html>
| <head>
| <body>
| <p>
| id="status"
| <noscript>
| "<strong>A</strong>"
| <span>
| "B"
#data
<div><sarcasm><div></div></sarcasm></div>
#errors
#document
| <html>
| <head>
| <body>
| <div>
| <sarcasm>
| <div>
#data
<html><body><img src="" border="0" alt="><div>A</div></body></html>
#errors
#document
| <html>
| <head>
| <body>
#data
<table><td></tbody>A
#errors
#document
| <html>
| <head>
| <body>
| "A"
| <table>
| <tbody>
| <tr>
| <td>
#data
<table><td></thead>A
#errors
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| "A"
#data
<table><td></tfoot>A
#errors
#document
| <html>
| <head>
| <body>
| <table>
| <tbody>
| <tr>
| <td>
| "A"
#data
<table><thead><td></tbody>A
#errors
#document
| <html>
| <head>
| <body>
| <table>
| <thead>
| <tr>
| <td>
| "A"
#data
<legend>test</legend>
#errors
#document
| <html>
| <head>
| <body>
| <legend>
| "test"
#data
<table><input>
#errors
#document
| <html>
| <head>
| <body>
| <input>
| <table>
#data
<b><em><dcell><postfield><postfield><postfield><postfield><missing_glyph><missing_glyph><missing_glyph><missing_glyph><hkern><aside></b></em>
#errors
#document-fragment
div
#document
| <b>
| <em>
| <dcell>
| <postfield>
| <postfield>
| <postfield>
| <postfield>
| <missing_glyph>
| <missing_glyph>
| <missing_glyph>
| <missing_glyph>
| <hkern>
| <aside>
| <em>
| <b>
#data
<isindex action="x">
#errors
#document-fragment
table
#document
| <form>
| action="x"
| <hr>
| <label>
| "This is a searchable index. Enter search keywords: "
| <input>
| name="isindex"
| <hr>
#data
<option><XH<optgroup></optgroup>
#errors
#document-fragment
select
#document
| <option>

View File

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,748 @@
// Copyright 2010 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package html
import (
"bytes"
"io"
"io/ioutil"
"reflect"
"runtime"
"strings"
"testing"
)
type tokenTest struct {
// A short description of the test case.
desc string
// The HTML to parse.
html string
// The string representations of the expected tokens, joined by '$'.
golden string
}
var tokenTests = []tokenTest{
{
"empty",
"",
"",
},
// A single text node. The tokenizer should not break text nodes on whitespace,
// nor should it normalize whitespace within a text node.
{
"text",
"foo bar",
"foo bar",
},
// An entity.
{
"entity",
"one &lt; two",
"one &lt; two",
},
// A start, self-closing and end tag. The tokenizer does not care if the start
// and end tokens don't match; that is the job of the parser.
{
"tags",
"<a>b<c/>d</e>",
"<a>$b$<c/>$d$</e>",
},
// Angle brackets that aren't a tag.
{
"not a tag #0",
"<",
"&lt;",
},
{
"not a tag #1",
"</",
"&lt;/",
},
{
"not a tag #2",
"</>",
"<!---->",
},
{
"not a tag #3",
"a</>b",
"a$<!---->$b",
},
{
"not a tag #4",
"</ >",
"<!-- -->",
},
{
"not a tag #5",
"</.",
"<!--.-->",
},
{
"not a tag #6",
"</.>",
"<!--.-->",
},
{
"not a tag #7",
"a < b",
"a &lt; b",
},
{
"not a tag #8",
"<.>",
"&lt;.&gt;",
},
{
"not a tag #9",
"a<<<b>>>c",
"a&lt;&lt;$<b>$&gt;&gt;c",
},
{
"not a tag #10",
"if x<0 and y < 0 then x*y>0",
"if x&lt;0 and y &lt; 0 then x*y&gt;0",
},
{
"not a tag #11",
"<<p>",
"&lt;$<p>",
},
// EOF in a tag name.
{
"tag name eof #0",
"<a",
"",
},
{
"tag name eof #1",
"<a ",
"",
},
{
"tag name eof #2",
"a<b",
"a",
},
{
"tag name eof #3",
"<a><b",
"<a>",
},
{
"tag name eof #4",
`<a x`,
``,
},
// Some malformed tags that are missing a '>'.
{
"malformed tag #0",
`<p</p>`,
`<p< p="">`,
},
{
"malformed tag #1",
`<p </p>`,
`<p <="" p="">`,
},
{
"malformed tag #2",
`<p id`,
``,
},
{
"malformed tag #3",
`<p id=`,
``,
},
{
"malformed tag #4",
`<p id=>`,
`<p id="">`,
},
{
"malformed tag #5",
`<p id=0`,
``,
},
{
"malformed tag #6",
`<p id=0</p>`,
`<p id="0&lt;/p">`,
},
{
"malformed tag #7",
`<p id="0</p>`,
``,
},
{
"malformed tag #8",
`<p id="0"</p>`,
`<p id="0" <="" p="">`,
},
{
"malformed tag #9",
`<p></p id`,
`<p>`,
},
// Raw text and RCDATA.
{
"basic raw text",
"<script><a></b></script>",
"<script>$&lt;a&gt;&lt;/b&gt;$</script>",
},
{
"unfinished script end tag",
"<SCRIPT>a</SCR",
"<script>$a&lt;/SCR",
},
{
"broken script end tag",
"<SCRIPT>a</SCR ipt>",
"<script>$a&lt;/SCR ipt&gt;",
},
{
"EOF in script end tag",
"<SCRIPT>a</SCRipt",
"<script>$a&lt;/SCRipt",
},
{
"scriptx end tag",
"<SCRIPT>a</SCRiptx",
"<script>$a&lt;/SCRiptx",
},
{
"' ' completes script end tag",
"<SCRIPT>a</SCRipt ",
"<script>$a",
},
{
"'>' completes script end tag",
"<SCRIPT>a</SCRipt>",
"<script>$a$</script>",
},
{
"self-closing script end tag",
"<SCRIPT>a</SCRipt/>",
"<script>$a$</script>",
},
{
"nested script tag",
"<SCRIPT>a</SCRipt<script>",
"<script>$a&lt;/SCRipt&lt;script&gt;",
},
{
"script end tag after unfinished",
"<SCRIPT>a</SCRipt</script>",
"<script>$a&lt;/SCRipt$</script>",
},
{
"script/style mismatched tags",
"<script>a</style>",
"<script>$a&lt;/style&gt;",
},
{
"style element with entity",
"<style>&apos;",
"<style>$&amp;apos;",
},
{
"textarea with tag",
"<textarea><div></textarea>",
"<textarea>$&lt;div&gt;$</textarea>",
},
{
"title with tag and entity",
"<title><b>K&amp;R C</b></title>",
"<title>$&lt;b&gt;K&amp;R C&lt;/b&gt;$</title>",
},
// DOCTYPE tests.
{
"Proper DOCTYPE",
"<!DOCTYPE html>",
"<!DOCTYPE html>",
},
{
"DOCTYPE with no space",
"<!doctypehtml>",
"<!DOCTYPE html>",
},
{
"DOCTYPE with two spaces",
"<!doctype html>",
"<!DOCTYPE html>",
},
{
"looks like DOCTYPE but isn't",
"<!DOCUMENT html>",
"<!--DOCUMENT html-->",
},
{
"DOCTYPE at EOF",
"<!DOCtype",
"<!DOCTYPE >",
},
// XML processing instructions.
{
"XML processing instruction",
"<?xml?>",
"<!--?xml?-->",
},
// Comments.
{
"comment0",
"abc<b><!-- skipme --></b>def",
"abc$<b>$<!-- skipme -->$</b>$def",
},
{
"comment1",
"a<!-->z",
"a$<!---->$z",
},
{
"comment2",
"a<!--->z",
"a$<!---->$z",
},
{
"comment3",
"a<!--x>-->z",
"a$<!--x>-->$z",
},
{
"comment4",
"a<!--x->-->z",
"a$<!--x->-->$z",
},
{
"comment5",
"a<!>z",
"a$<!---->$z",
},
{
"comment6",
"a<!->z",
"a$<!----->$z",
},
{
"comment7",
"a<!---<>z",
"a$<!---<>z-->",
},
{
"comment8",
"a<!--z",
"a$<!--z-->",
},
{
"comment9",
"a<!--z-",
"a$<!--z-->",
},
{
"comment10",
"a<!--z--",
"a$<!--z-->",
},
{
"comment11",
"a<!--z---",
"a$<!--z--->",
},
{
"comment12",
"a<!--z----",
"a$<!--z---->",
},
{
"comment13",
"a<!--x--!>z",
"a$<!--x-->$z",
},
// An attribute with a backslash.
{
"backslash",
`<p id="a\"b">`,
`<p id="a\" b"="">`,
},
// Entities, tag name and attribute key lower-casing, and whitespace
// normalization within a tag.
{
"tricky",
"<p \t\n iD=\"a&quot;B\" foo=\"bar\"><EM>te&lt;&amp;;xt</em></p>",
`<p id="a&#34;B" foo="bar">$<em>$te&lt;&amp;;xt$</em>$</p>`,
},
// A nonexistent entity. Tokenizing and converting back to a string should
// escape the "&" to become "&amp;".
{
"noSuchEntity",
`<a b="c&noSuchEntity;d">&lt;&alsoDoesntExist;&`,
`<a b="c&amp;noSuchEntity;d">$&lt;&amp;alsoDoesntExist;&amp;`,
},
{
"entity without semicolon",
`&notit;&notin;<a b="q=z&amp=5&notice=hello&not;=world">`,
`¬it;∉$<a b="q=z&amp;amp=5&amp;notice=hello¬=world">`,
},
{
"entity with digits",
"&frac12;",
"½",
},
// Attribute tests:
// http://dev.w3.org/html5/spec/Overview.html#attributes-0
{
"Empty attribute",
`<input disabled FOO>`,
`<input disabled="" foo="">`,
},
{
"Empty attribute, whitespace",
`<input disabled FOO >`,
`<input disabled="" foo="">`,
},
{
"Unquoted attribute value",
`<input value=yes FOO=BAR>`,
`<input value="yes" foo="BAR">`,
},
{
"Unquoted attribute value, spaces",
`<input value = yes FOO = BAR>`,
`<input value="yes" foo="BAR">`,
},
{
"Unquoted attribute value, trailing space",
`<input value=yes FOO=BAR >`,
`<input value="yes" foo="BAR">`,
},
{
"Single-quoted attribute value",
`<input value='yes' FOO='BAR'>`,
`<input value="yes" foo="BAR">`,
},
{
"Single-quoted attribute value, trailing space",
`<input value='yes' FOO='BAR' >`,
`<input value="yes" foo="BAR">`,
},
{
"Double-quoted attribute value",
`<input value="I'm an attribute" FOO="BAR">`,
`<input value="I&#39;m an attribute" foo="BAR">`,
},
{
"Attribute name characters",
`<meta http-equiv="content-type">`,
`<meta http-equiv="content-type">`,
},
{
"Mixed attributes",
`a<P V="0 1" w='2' X=3 y>z`,
`a$<p v="0 1" w="2" x="3" y="">$z`,
},
{
"Attributes with a solitary single quote",
`<p id=can't><p id=won't>`,
`<p id="can&#39;t">$<p id="won&#39;t">`,
},
}
func TestTokenizer(t *testing.T) {
loop:
for _, tt := range tokenTests {
z := NewTokenizer(strings.NewReader(tt.html))
if tt.golden != "" {
for i, s := range strings.Split(tt.golden, "$") {
if z.Next() == ErrorToken {
t.Errorf("%s token %d: want %q got error %v", tt.desc, i, s, z.Err())
continue loop
}
actual := z.Token().String()
if s != actual {
t.Errorf("%s token %d: want %q got %q", tt.desc, i, s, actual)
continue loop
}
}
}
z.Next()
if z.Err() != io.EOF {
t.Errorf("%s: want EOF got %q", tt.desc, z.Err())
}
}
}
func TestMaxBuffer(t *testing.T) {
// Exceeding the maximum buffer size generates ErrBufferExceeded.
z := NewTokenizer(strings.NewReader("<" + strings.Repeat("t", 10)))
z.SetMaxBuf(5)
tt := z.Next()
if got, want := tt, ErrorToken; got != want {
t.Fatalf("token type: got: %v want: %v", got, want)
}
if got, want := z.Err(), ErrBufferExceeded; got != want {
t.Errorf("error type: got: %v want: %v", got, want)
}
if got, want := string(z.Raw()), "<tttt"; got != want {
t.Fatalf("buffered before overflow: got: %q want: %q", got, want)
}
}
func TestMaxBufferReconstruction(t *testing.T) {
// Exceeding the maximum buffer size at any point while tokenizing permits
// reconstructing the original input.
tests:
for _, test := range tokenTests {
for maxBuf := 1; ; maxBuf++ {
r := strings.NewReader(test.html)
z := NewTokenizer(r)
z.SetMaxBuf(maxBuf)
var tokenized bytes.Buffer
for {
tt := z.Next()
tokenized.Write(z.Raw())
if tt == ErrorToken {
if err := z.Err(); err != io.EOF && err != ErrBufferExceeded {
t.Errorf("%s: unexpected error: %v", test.desc, err)
}
break
}
}
// Anything tokenized along with untokenized input or data left in the reader.
assembled, err := ioutil.ReadAll(io.MultiReader(&tokenized, bytes.NewReader(z.Buffered()), r))
if err != nil {
t.Errorf("%s: ReadAll: %v", test.desc, err)
continue tests
}
if got, want := string(assembled), test.html; got != want {
t.Errorf("%s: reassembled html:\n got: %q\nwant: %q", test.desc, got, want)
continue tests
}
// EOF indicates that we completed tokenization and hence found the max
// maxBuf that generates ErrBufferExceeded, so continue to the next test.
if z.Err() == io.EOF {
break
}
} // buffer sizes
} // tests
}
func TestPassthrough(t *testing.T) {
// Accumulating the raw output for each parse event should reconstruct the
// original input.
for _, test := range tokenTests {
z := NewTokenizer(strings.NewReader(test.html))
var parsed bytes.Buffer
for {
tt := z.Next()
parsed.Write(z.Raw())
if tt == ErrorToken {
break
}
}
if got, want := parsed.String(), test.html; got != want {
t.Errorf("%s: parsed output:\n got: %q\nwant: %q", test.desc, got, want)
}
}
}
func TestBufAPI(t *testing.T) {
s := "0<a>1</a>2<b>3<a>4<a>5</a>6</b>7</a>8<a/>9"
z := NewTokenizer(bytes.NewBufferString(s))
var result bytes.Buffer
depth := 0
loop:
for {
tt := z.Next()
switch tt {
case ErrorToken:
if z.Err() != io.EOF {
t.Error(z.Err())
}
break loop
case TextToken:
if depth > 0 {
result.Write(z.Text())
}
case StartTagToken, EndTagToken:
tn, _ := z.TagName()
if len(tn) == 1 && tn[0] == 'a' {
if tt == StartTagToken {
depth++
} else {
depth--
}
}
}
}
u := "14567"
v := string(result.Bytes())
if u != v {
t.Errorf("TestBufAPI: want %q got %q", u, v)
}
}
func TestConvertNewlines(t *testing.T) {
testCases := map[string]string{
"Mac\rDOS\r\nUnix\n": "Mac\nDOS\nUnix\n",
"Unix\nMac\rDOS\r\n": "Unix\nMac\nDOS\n",
"DOS\r\nDOS\r\nDOS\r\n": "DOS\nDOS\nDOS\n",
"": "",
"\n": "\n",
"\n\r": "\n\n",
"\r": "\n",
"\r\n": "\n",
"\r\n\n": "\n\n",
"\r\n\r": "\n\n",
"\r\n\r\n": "\n\n",
"\r\r": "\n\n",
"\r\r\n": "\n\n",
"\r\r\n\n": "\n\n\n",
"\r\r\r\n": "\n\n\n",
"\r \n": "\n \n",
"xyz": "xyz",
}
for in, want := range testCases {
if got := string(convertNewlines([]byte(in))); got != want {
t.Errorf("input %q: got %q, want %q", in, got, want)
}
}
}
func TestReaderEdgeCases(t *testing.T) {
const s = "<p>An io.Reader can return (0, nil) or (n, io.EOF).</p>"
testCases := []io.Reader{
&zeroOneByteReader{s: s},
&eofStringsReader{s: s},
&stuckReader{},
}
for i, tc := range testCases {
got := []TokenType{}
z := NewTokenizer(tc)
for {
tt := z.Next()
if tt == ErrorToken {
break
}
got = append(got, tt)
}
if err := z.Err(); err != nil && err != io.EOF {
if err != io.ErrNoProgress {
t.Errorf("i=%d: %v", i, err)
}
continue
}
want := []TokenType{
StartTagToken,
TextToken,
EndTagToken,
}
if !reflect.DeepEqual(got, want) {
t.Errorf("i=%d: got %v, want %v", i, got, want)
continue
}
}
}
// zeroOneByteReader is like a strings.Reader that alternates between
// returning 0 bytes and 1 byte at a time.
type zeroOneByteReader struct {
s string
n int
}
func (r *zeroOneByteReader) Read(p []byte) (int, error) {
if len(p) == 0 {
return 0, nil
}
if len(r.s) == 0 {
return 0, io.EOF
}
r.n++
if r.n%2 != 0 {
return 0, nil
}
p[0], r.s = r.s[0], r.s[1:]
return 1, nil
}
// eofStringsReader is like a strings.Reader but can return an (n, err) where
// n > 0 && err != nil.
type eofStringsReader struct {
s string
}
func (r *eofStringsReader) Read(p []byte) (int, error) {
n := copy(p, r.s)
r.s = r.s[n:]
if r.s != "" {
return n, nil
}
return n, io.EOF
}
// stuckReader is an io.Reader that always returns no data and no error.
type stuckReader struct{}
func (*stuckReader) Read(p []byte) (int, error) {
return 0, nil
}
const (
rawLevel = iota
lowLevel
highLevel
)
func benchmarkTokenizer(b *testing.B, level int) {
buf, err := ioutil.ReadFile("testdata/go1.html")
if err != nil {
b.Fatalf("could not read testdata/go1.html: %v", err)
}
b.SetBytes(int64(len(buf)))
runtime.GC()
b.ReportAllocs()
b.ResetTimer()
for i := 0; i < b.N; i++ {
z := NewTokenizer(bytes.NewBuffer(buf))
for {
tt := z.Next()
if tt == ErrorToken {
if err := z.Err(); err != nil && err != io.EOF {
b.Fatalf("tokenizer error: %v", err)
}
break
}
switch level {
case rawLevel:
// Calling z.Raw just returns the raw bytes of the token. It does
// not unescape &lt; to <, or lower-case tag names and attribute keys.
z.Raw()
case lowLevel:
// Caling z.Text, z.TagName and z.TagAttr returns []byte values
// whose contents may change on the next call to z.Next.
switch tt {
case TextToken, CommentToken, DoctypeToken:
z.Text()
case StartTagToken, SelfClosingTagToken:
_, more := z.TagName()
for more {
_, _, more = z.TagAttr()
}
case EndTagToken:
z.TagName()
}
case highLevel:
// Calling z.Token converts []byte values to strings whose validity
// extend beyond the next call to z.Next.
z.Token()
}
}
}
}
func BenchmarkRawLevelTokenizer(b *testing.B) { benchmarkTokenizer(b, rawLevel) }
func BenchmarkLowLevelTokenizer(b *testing.B) { benchmarkTokenizer(b, lowLevel) }
func BenchmarkHighLevelTokenizer(b *testing.B) { benchmarkTokenizer(b, highLevel) }

View File

@@ -9,6 +9,7 @@
package transform
import (
"bytes"
"errors"
"io"
"unicode/utf8"
@@ -127,7 +128,7 @@ func (r *Reader) Read(p []byte) (int, error) {
// cannot read more bytes into src.
r.transformComplete = r.err != nil
continue
case err == ErrShortDst && r.dst1 != 0:
case err == ErrShortDst && (r.dst1 != 0 || n != 0):
// Make room in dst by copying out, and try again.
continue
case err == ErrShortSrc && r.src1-r.src0 != len(r.src) && r.err == nil:
@@ -210,7 +211,7 @@ func (w *Writer) Write(data []byte) (n int, err error) {
n += nSrc
}
switch {
case err == ErrShortDst && nDst > 0:
case err == ErrShortDst && (nDst > 0 || nSrc > 0):
case err == ErrShortSrc && len(src) < len(w.src):
m := copy(w.src, src)
// If w.n > 0, bytes from data were already copied to w.src and n
@@ -467,30 +468,125 @@ func (t removeF) Transform(dst, src []byte, atEOF bool) (nDst, nSrc int, err err
return
}
// Bytes returns a new byte slice with the result of converting b using t.
// If any unrecoverable error occurs it returns nil.
func Bytes(t Transformer, b []byte) []byte {
out := make([]byte, len(b))
n := 0
for {
nDst, nSrc, err := t.Transform(out[n:], b, true)
n += nDst
if err == nil {
return out[:n]
} else if err != ErrShortDst {
return nil
}
b = b[nSrc:]
// grow returns a new []byte that is longer than b, and copies the first n bytes
// of b to the start of the new slice.
func grow(b []byte, n int) []byte {
m := len(b)
if m <= 256 {
m *= 2
} else {
m += m >> 1
}
buf := make([]byte, m)
copy(buf, b[:n])
return buf
}
// Grow the destination buffer.
sz := len(out)
if sz <= 256 {
sz *= 2
} else {
sz += sz >> 1
const initialBufSize = 128
// String returns a string with the result of converting s[:n] using t, where
// n <= len(s). If err == nil, n will be len(s).
func String(t Transformer, s string) (result string, n int, err error) {
if s == "" {
return "", 0, nil
}
// Allocate only once. Note that both dst and src escape when passed to
// Transform.
buf := [2 * initialBufSize]byte{}
dst := buf[:initialBufSize:initialBufSize]
src := buf[initialBufSize : 2*initialBufSize]
// Avoid allocation if the transformed string is identical to the original.
// After this loop, pDst will point to the furthest point in s for which it
// could be detected that t gives equal results, src[:nSrc] will
// indicated the last processed chunk of s for which the output is not equal
// and dst[:nDst] will be the transform of this chunk.
var nDst, nSrc int
pDst := 0 // Used as index in both src and dst in this loop.
for {
n := copy(src, s[pDst:])
nDst, nSrc, err = t.Transform(dst, src[:n], pDst+n == len(s))
// Note 1: we will not enter the loop with pDst == len(s) and we will
// not end the loop with it either. So if nSrc is 0, this means there is
// some kind of error from which we cannot recover given the current
// buffer sizes. We will give up in this case.
// Note 2: it is not entirely correct to simply do a bytes.Equal as
// a Transformer may buffer internally. It will work in most cases,
// though, and no harm is done if it doesn't work.
// TODO: let transformers implement an optional Spanner interface, akin
// to norm's QuickSpan. This would even allow us to avoid any allocation.
if nSrc == 0 || !bytes.Equal(dst[:nDst], src[:nSrc]) {
break
}
if pDst += nDst; pDst == len(s) {
return s, pDst, nil
}
}
// Move the bytes seen so far to dst.
pSrc := pDst + nSrc
if pDst+nDst <= initialBufSize {
copy(dst[pDst:], dst[:nDst])
} else {
b := make([]byte, len(s)+nDst-nSrc)
copy(b[pDst:], dst[:nDst])
dst = b
}
copy(dst, s[:pDst])
pDst += nDst
if err != nil && err != ErrShortDst && err != ErrShortSrc {
return string(dst[:pDst]), pSrc, err
}
// Complete the string with the remainder.
for {
n := copy(src, s[pSrc:])
nDst, nSrc, err = t.Transform(dst[pDst:], src[:n], pSrc+n == len(s))
pDst += nDst
pSrc += nSrc
switch err {
case nil:
if pSrc == len(s) {
return string(dst[:pDst]), pSrc, nil
}
case ErrShortDst:
// Do not grow as long as we can make progress. This may avoid
// excessive allocations.
if nDst == 0 {
dst = grow(dst, pDst)
}
case ErrShortSrc:
if nSrc == 0 {
src = grow(src, 0)
}
default:
return string(dst[:pDst]), pSrc, err
}
}
}
// Bytes returns a new byte slice with the result of converting b[:n] using t,
// where n <= len(b). If err == nil, n will be len(b).
func Bytes(t Transformer, b []byte) (result []byte, n int, err error) {
dst := make([]byte, len(b))
pDst, pSrc := 0, 0
for {
nDst, nSrc, err := t.Transform(dst[pDst:], b[pSrc:], true)
pDst += nDst
pSrc += nSrc
if err != ErrShortDst {
return dst[:pDst], pSrc, err
}
// Grow the destination buffer, but do not grow as long as we can make
// progress. This may avoid excessive allocations.
if nDst == 0 {
dst = grow(dst, pDst)
}
out2 := make([]byte, sz)
copy(out2, out[:n])
out = out2
}
}

View File

@@ -12,6 +12,7 @@ import (
"strconv"
"strings"
"testing"
"time"
"unicode/utf8"
)
@@ -132,6 +133,43 @@ func (e rleEncode) Transform(dst, src []byte, atEOF bool) (nDst, nSrc int, err e
return nDst, nSrc, nil
}
// trickler consumes all input bytes, but writes a single byte at a time to dst.
type trickler []byte
func (t *trickler) Transform(dst, src []byte, atEOF bool) (nDst, nSrc int, err error) {
*t = append(*t, src...)
if len(*t) == 0 {
return 0, 0, nil
}
if len(dst) == 0 {
return 0, len(src), ErrShortDst
}
dst[0] = (*t)[0]
*t = (*t)[1:]
if len(*t) > 0 {
err = ErrShortDst
}
return 1, len(src), err
}
// delayedTrickler is like trickler, but delays writing output to dst. This is
// highly unlikely to be relevant in practice, but it seems like a good idea
// to have some tolerance as long as progress can be detected.
type delayedTrickler []byte
func (t *delayedTrickler) Transform(dst, src []byte, atEOF bool) (nDst, nSrc int, err error) {
if len(*t) > 0 && len(dst) > 0 {
dst[0] = (*t)[0]
*t = (*t)[1:]
nDst = 1
}
*t = append(*t, src...)
if len(*t) > 0 {
err = ErrShortDst
}
return nDst, len(src), err
}
type testCase struct {
desc string
t Transformer
@@ -170,6 +208,15 @@ func (c chain) String() string {
}
var testCases = []testCase{
{
desc: "empty",
t: lowerCaseASCII{},
src: "",
dstSize: 100,
srcSize: 100,
wantStr: "",
},
{
desc: "basic",
t: lowerCaseASCII{},
@@ -378,6 +425,24 @@ var testCases = []testCase{
ioSize: 10,
wantStr: "4a6b2b4c4d1d",
},
{
desc: "trickler",
t: &trickler{},
src: "abcdefghijklm",
dstSize: 3,
srcSize: 15,
wantStr: "abcdefghijklm",
},
{
desc: "delayedTrickler",
t: &delayedTrickler{},
src: "abcdefghijklm",
dstSize: 3,
srcSize: 15,
wantStr: "abcdefghijklm",
},
}
func TestReader(t *testing.T) {
@@ -685,7 +750,7 @@ func doTransform(tc testCase) (res string, iter int, err error) {
switch {
case err == nil && len(in) != 0:
case err == ErrShortSrc && nSrc > 0:
case err == ErrShortDst && nDst > 0:
case err == ErrShortDst && (nDst > 0 || nSrc > 0):
default:
return string(out), iter, err
}
@@ -875,27 +940,136 @@ func TestRemoveFunc(t *testing.T) {
}
}
func TestBytes(t *testing.T) {
func testString(t *testing.T, f func(Transformer, string) (string, int, error)) {
for _, tt := range append(testCases, chainTests()...) {
if tt.desc == "allowStutter = true" {
// We don't have control over the buffer size, so we eliminate tests
// that depend on a specific buffer size being set.
continue
}
got := Bytes(tt.t, []byte(tt.src))
if tt.wantErr != nil {
if tt.wantErr != ErrShortDst && tt.wantErr != ErrShortSrc {
// Bytes should return nil for non-recoverable errors.
if g, w := (got == nil), (tt.wantErr != nil); g != w {
t.Errorf("%s:error: got %v; want %v", tt.desc, g, w)
}
}
// The output strings in the tests that expect an error will
// almost certainly not be the same as the result of Bytes.
reset(tt.t)
if tt.wantErr == ErrShortDst || tt.wantErr == ErrShortSrc {
// The result string will be different.
continue
}
if string(got) != tt.wantStr {
got, n, err := f(tt.t, tt.src)
if tt.wantErr != err {
t.Errorf("%s:error: got %v; want %v", tt.desc, err, tt.wantErr)
}
if got, want := err == nil, n == len(tt.src); got != want {
t.Errorf("%s:n: got %v; want %v", tt.desc, got, want)
}
if got != tt.wantStr {
t.Errorf("%s:string: got %q; want %q", tt.desc, got, tt.wantStr)
}
}
}
func TestBytes(t *testing.T) {
testString(t, func(z Transformer, s string) (string, int, error) {
b, n, err := Bytes(z, []byte(s))
return string(b), n, err
})
}
func TestString(t *testing.T) {
testString(t, String)
// Overrun the internal destination buffer.
for i, s := range []string{
strings.Repeat("a", initialBufSize-1),
strings.Repeat("a", initialBufSize+0),
strings.Repeat("a", initialBufSize+1),
strings.Repeat("A", initialBufSize-1),
strings.Repeat("A", initialBufSize+0),
strings.Repeat("A", initialBufSize+1),
strings.Repeat("A", 2*initialBufSize-1),
strings.Repeat("A", 2*initialBufSize+0),
strings.Repeat("A", 2*initialBufSize+1),
strings.Repeat("a", initialBufSize-2) + "A",
strings.Repeat("a", initialBufSize-1) + "A",
strings.Repeat("a", initialBufSize+0) + "A",
strings.Repeat("a", initialBufSize+1) + "A",
} {
got, _, _ := String(lowerCaseASCII{}, s)
if want := strings.ToLower(s); got != want {
t.Errorf("%d:dst buffer test: got %s (%d); want %s (%d)", i, got, len(got), want, len(want))
}
}
// Overrun the internal source buffer.
for i, s := range []string{
strings.Repeat("a", initialBufSize-1),
strings.Repeat("a", initialBufSize+0),
strings.Repeat("a", initialBufSize+1),
strings.Repeat("a", 2*initialBufSize+1),
strings.Repeat("a", 2*initialBufSize+0),
strings.Repeat("a", 2*initialBufSize+1),
} {
got, _, _ := String(rleEncode{}, s)
if want := fmt.Sprintf("%da", len(s)); got != want {
t.Errorf("%d:src buffer test: got %s (%d); want %s (%d)", i, got, len(got), want, len(want))
}
}
// Test allocations for non-changing strings.
// Note we still need to allocate a single buffer.
for i, s := range []string{
"",
"123",
"123456789",
strings.Repeat("a", initialBufSize),
strings.Repeat("a", 10*initialBufSize),
} {
if n := testing.AllocsPerRun(5, func() { String(&lowerCaseASCII{}, s) }); n > 1 {
t.Errorf("%d: #allocs was %f; want 1", i, n)
}
}
}
// TestBytesAllocation tests that buffer growth stays limited with the trickler
// transformer, which behaves oddly but within spec. In case buffer growth is
// not correctly handled, the test will either panic with a failed allocation or
// thrash. To ensure the tests terminate under the last condition, we time out
// after some sufficiently long period of time.
func TestBytesAllocation(t *testing.T) {
done := make(chan bool)
go func() {
in := bytes.Repeat([]byte{'a'}, 1000)
tr := trickler(make([]byte, 1))
Bytes(&tr, in)
done <- true
}()
select {
case <-done:
case <-time.After(3 * time.Second):
t.Error("time out, likely due to excessive allocation")
}
}
// TestStringAllocation tests that buffer growth stays limited with the trickler
// transformer, which behaves oddly but within spec. In case buffer growth is
// not correctly handled, the test will either panic with a failed allocation or
// thrash. To ensure the tests terminate under the last condition, we time out
// after some sufficiently long period of time.
func TestStringAllocation(t *testing.T) {
done := make(chan bool)
go func() {
in := strings.Repeat("a", 1000)
tr := trickler(make([]byte, 1))
String(&tr, in)
done <- true
}()
select {
case <-done:
case <-time.After(3 * time.Second):
t.Error("time out, likely due to excessive allocation")
}
}
func BenchmarkStringLower(b *testing.B) {
in := strings.Repeat("a", 4096)
for i := 0; i < b.N; i++ {
String(&lowerCaseASCII{}, in)
}
}

View File

@@ -0,0 +1,124 @@
// Copyright 2011 The Snappy-Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package snappy
import (
"encoding/binary"
"errors"
)
// ErrCorrupt reports that the input is invalid.
var ErrCorrupt = errors.New("snappy: corrupt input")
// DecodedLen returns the length of the decoded block.
func DecodedLen(src []byte) (int, error) {
v, _, err := decodedLen(src)
return v, err
}
// decodedLen returns the length of the decoded block and the number of bytes
// that the length header occupied.
func decodedLen(src []byte) (blockLen, headerLen int, err error) {
v, n := binary.Uvarint(src)
if n == 0 {
return 0, 0, ErrCorrupt
}
if uint64(int(v)) != v {
return 0, 0, errors.New("snappy: decoded block is too large")
}
return int(v), n, nil
}
// Decode returns the decoded form of src. The returned slice may be a sub-
// slice of dst if dst was large enough to hold the entire decoded block.
// Otherwise, a newly allocated slice will be returned.
// It is valid to pass a nil dst.
func Decode(dst, src []byte) ([]byte, error) {
dLen, s, err := decodedLen(src)
if err != nil {
return nil, err
}
if len(dst) < dLen {
dst = make([]byte, dLen)
}
var d, offset, length int
for s < len(src) {
switch src[s] & 0x03 {
case tagLiteral:
x := uint(src[s] >> 2)
switch {
case x < 60:
s += 1
case x == 60:
s += 2
if s > len(src) {
return nil, ErrCorrupt
}
x = uint(src[s-1])
case x == 61:
s += 3
if s > len(src) {
return nil, ErrCorrupt
}
x = uint(src[s-2]) | uint(src[s-1])<<8
case x == 62:
s += 4
if s > len(src) {
return nil, ErrCorrupt
}
x = uint(src[s-3]) | uint(src[s-2])<<8 | uint(src[s-1])<<16
case x == 63:
s += 5
if s > len(src) {
return nil, ErrCorrupt
}
x = uint(src[s-4]) | uint(src[s-3])<<8 | uint(src[s-2])<<16 | uint(src[s-1])<<24
}
length = int(x + 1)
if length <= 0 {
return nil, errors.New("snappy: unsupported literal length")
}
if length > len(dst)-d || length > len(src)-s {
return nil, ErrCorrupt
}
copy(dst[d:], src[s:s+length])
d += length
s += length
continue
case tagCopy1:
s += 2
if s > len(src) {
return nil, ErrCorrupt
}
length = 4 + int(src[s-2])>>2&0x7
offset = int(src[s-2])&0xe0<<3 | int(src[s-1])
case tagCopy2:
s += 3
if s > len(src) {
return nil, ErrCorrupt
}
length = 1 + int(src[s-3])>>2
offset = int(src[s-2]) | int(src[s-1])<<8
case tagCopy4:
return nil, errors.New("snappy: unsupported COPY_4 tag")
}
end := d + length
if offset > d || end > len(dst) {
return nil, ErrCorrupt
}
for ; d < end; d++ {
dst[d] = dst[d-offset]
}
}
if d != dLen {
return nil, ErrCorrupt
}
return dst[:d], nil
}

View File

@@ -0,0 +1,174 @@
// Copyright 2011 The Snappy-Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package snappy
import (
"encoding/binary"
)
// We limit how far copy back-references can go, the same as the C++ code.
const maxOffset = 1 << 15
// emitLiteral writes a literal chunk and returns the number of bytes written.
func emitLiteral(dst, lit []byte) int {
i, n := 0, uint(len(lit)-1)
switch {
case n < 60:
dst[0] = uint8(n)<<2 | tagLiteral
i = 1
case n < 1<<8:
dst[0] = 60<<2 | tagLiteral
dst[1] = uint8(n)
i = 2
case n < 1<<16:
dst[0] = 61<<2 | tagLiteral
dst[1] = uint8(n)
dst[2] = uint8(n >> 8)
i = 3
case n < 1<<24:
dst[0] = 62<<2 | tagLiteral
dst[1] = uint8(n)
dst[2] = uint8(n >> 8)
dst[3] = uint8(n >> 16)
i = 4
case int64(n) < 1<<32:
dst[0] = 63<<2 | tagLiteral
dst[1] = uint8(n)
dst[2] = uint8(n >> 8)
dst[3] = uint8(n >> 16)
dst[4] = uint8(n >> 24)
i = 5
default:
panic("snappy: source buffer is too long")
}
if copy(dst[i:], lit) != len(lit) {
panic("snappy: destination buffer is too short")
}
return i + len(lit)
}
// emitCopy writes a copy chunk and returns the number of bytes written.
func emitCopy(dst []byte, offset, length int) int {
i := 0
for length > 0 {
x := length - 4
if 0 <= x && x < 1<<3 && offset < 1<<11 {
dst[i+0] = uint8(offset>>8)&0x07<<5 | uint8(x)<<2 | tagCopy1
dst[i+1] = uint8(offset)
i += 2
break
}
x = length
if x > 1<<6 {
x = 1 << 6
}
dst[i+0] = uint8(x-1)<<2 | tagCopy2
dst[i+1] = uint8(offset)
dst[i+2] = uint8(offset >> 8)
i += 3
length -= x
}
return i
}
// Encode returns the encoded form of src. The returned slice may be a sub-
// slice of dst if dst was large enough to hold the entire encoded block.
// Otherwise, a newly allocated slice will be returned.
// It is valid to pass a nil dst.
func Encode(dst, src []byte) ([]byte, error) {
if n := MaxEncodedLen(len(src)); len(dst) < n {
dst = make([]byte, n)
}
// The block starts with the varint-encoded length of the decompressed bytes.
d := binary.PutUvarint(dst, uint64(len(src)))
// Return early if src is short.
if len(src) <= 4 {
if len(src) != 0 {
d += emitLiteral(dst[d:], src)
}
return dst[:d], nil
}
// Initialize the hash table. Its size ranges from 1<<8 to 1<<14 inclusive.
const maxTableSize = 1 << 14
shift, tableSize := uint(32-8), 1<<8
for tableSize < maxTableSize && tableSize < len(src) {
shift--
tableSize *= 2
}
var table [maxTableSize]int
// Iterate over the source bytes.
var (
s int // The iterator position.
t int // The last position with the same hash as s.
lit int // The start position of any pending literal bytes.
)
for s+3 < len(src) {
// Update the hash table.
b0, b1, b2, b3 := src[s], src[s+1], src[s+2], src[s+3]
h := uint32(b0) | uint32(b1)<<8 | uint32(b2)<<16 | uint32(b3)<<24
p := &table[(h*0x1e35a7bd)>>shift]
// We need to to store values in [-1, inf) in table. To save
// some initialization time, (re)use the table's zero value
// and shift the values against this zero: add 1 on writes,
// subtract 1 on reads.
t, *p = *p-1, s+1
// If t is invalid or src[s:s+4] differs from src[t:t+4], accumulate a literal byte.
if t < 0 || s-t >= maxOffset || b0 != src[t] || b1 != src[t+1] || b2 != src[t+2] || b3 != src[t+3] {
s++
continue
}
// Otherwise, we have a match. First, emit any pending literal bytes.
if lit != s {
d += emitLiteral(dst[d:], src[lit:s])
}
// Extend the match to be as long as possible.
s0 := s
s, t = s+4, t+4
for s < len(src) && src[s] == src[t] {
s++
t++
}
// Emit the copied bytes.
d += emitCopy(dst[d:], s-t, s-s0)
lit = s
}
// Emit any final pending literal bytes and return.
if lit != len(src) {
d += emitLiteral(dst[d:], src[lit:])
}
return dst[:d], nil
}
// MaxEncodedLen returns the maximum length of a snappy block, given its
// uncompressed length.
func MaxEncodedLen(srcLen int) int {
// Compressed data can be defined as:
// compressed := item* literal*
// item := literal* copy
//
// The trailing literal sequence has a space blowup of at most 62/60
// since a literal of length 60 needs one tag byte + one extra byte
// for length information.
//
// Item blowup is trickier to measure. Suppose the "copy" op copies
// 4 bytes of data. Because of a special check in the encoding code,
// we produce a 4-byte copy only if the offset is < 65536. Therefore
// the copy op takes 3 bytes to encode, and this type of item leads
// to at most the 62/60 blowup for representing literals.
//
// Suppose the "copy" op copies 5 bytes of data. If the offset is big
// enough, it will take 5 bytes to encode the copy op. Therefore the
// worst case here is a one-byte literal followed by a five-byte copy.
// That is, 6 bytes of input turn into 7 bytes of "compressed" data.
//
// This last factor dominates the blowup, so the final estimate is:
return 32 + srcLen + srcLen/6
}

View File

@@ -0,0 +1,38 @@
// Copyright 2011 The Snappy-Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Package snappy implements the snappy block-based compression format.
// It aims for very high speeds and reasonable compression.
//
// The C++ snappy implementation is at http://code.google.com/p/snappy/
package snappy
/*
Each encoded block begins with the varint-encoded length of the decoded data,
followed by a sequence of chunks. Chunks begin and end on byte boundaries. The
first byte of each chunk is broken into its 2 least and 6 most significant bits
called l and m: l ranges in [0, 4) and m ranges in [0, 64). l is the chunk tag.
Zero means a literal tag. All other values mean a copy tag.
For literal tags:
- If m < 60, the next 1 + m bytes are literal bytes.
- Otherwise, let n be the little-endian unsigned integer denoted by the next
m - 59 bytes. The next 1 + n bytes after that are literal bytes.
For copy tags, length bytes are copied from offset bytes ago, in the style of
Lempel-Ziv compression algorithms. In particular:
- For l == 1, the offset ranges in [0, 1<<11) and the length in [4, 12).
The length is 4 + the low 3 bits of m. The high 3 bits of m form bits 8-10
of the offset. The next byte is bits 0-7 of the offset.
- For l == 2, the offset ranges in [0, 1<<16) and the length in [1, 65).
The length is 1 + m. The offset is the little-endian unsigned integer
denoted by the next 2 bytes.
- For l == 3, this tag is a legacy format that is no longer supported.
*/
const (
tagLiteral = 0x00
tagCopy1 = 0x01
tagCopy2 = 0x02
tagCopy4 = 0x03
)

Some files were not shown because too many files have changed in this diff Show More