After rerunning the “missing_tree” links, I tallied the links up:
mysql> SELECT stanford, COUNT(stanford) FROM `hyperlinks_links` GROUP BY stanford;
+----------------------------------+-----------------+
| stanford                         | COUNT(stanford) |
+----------------------------------+-----------------+
| almost_constituent:-LRB-         |               1 |
| almost_constituent:ADJP          |              17 |
| almost_constituent:ADVP          |              14 |
| almost_constituent:CC            |               1 |
| almost_constituent:CD            |              41 |
| almost_constituent:FRAG          |               5 |
| almost_constituent:JJ            |              57 |
| almost_constituent:NN            |              58 |
| almost_constituent:NNP           |              41 |
| almost_constituent:NNPS          |               6 |
| almost_constituent:NNS           |              43 |
| almost_constituent:NP            |             283 |
| almost_constituent:PP            |               8 |
| almost_constituent:PRT           |               1 |
| almost_constituent:QP            |               3 |
| almost_constituent:RB            |               3 |
| almost_constituent:ROOT          |              70 |
| almost_constituent:S             |              15 |
| almost_constituent:SBAR          |              16 |
| almost_constituent:SBARQ         |               1 |
| almost_constituent:VBN           |               1 |
| almost_constituent:VP            |              31 |
| almost_constituent:X             |               4 |
| almost_constituent_2ndpass_:NN   |               1 |
| almost_constituent_2ndpass_:NNP  |               1 |
| almost_constituent_2ndpass_:NP   |               5 |
| almost_constituent_2ndpass_:PRP$ |               1 |
| constituent                      |           17435 |
| missing_tree                     |            1338 |
| multiple_constituents            |            5014 |
| not_constituent                  |            5056 |
| unknown_error                    |            1309 |
| xclausal                         |             301 |
+----------------------------------+-----------------+
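+---------------------------+-------+
| Variable                  | Value |
+---------------------------+-------+
| Almost constituents       |   728 |
| Constituents              | 17435 |
| Multiple constituents     |  5014 |
| Non-constituents          |  5056 |
| Cross-clausal links       |   301 |
| Links with missing trees  |  1338 |
| Links with unknown errors |  1309 |
+---------------------------+-------+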
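The summary values fold every almost_constituent label, including the 2ndpass ones, into a single "Almost constituents" figure. Here is a minimal Python sketch of that grouping, using the counts copied from the query output. The names raw_counts and summary_category are illustrative and not part of the original analysis.

```python
from collections import Counter

# Per-label counts copied from the GROUP BY output.
raw_counts = {
    "almost_constituent:-LRB-": 1, "almost_constituent:ADJP": 17,
    "almost_constituent:ADVP": 14, "almost_constituent:CC": 1,
    "almost_constituent:CD": 41, "almost_constituent:FRAG": 5,
    "almost_constituent:JJ": 57, "almost_constituent:NN": 58,
    "almost_constituent:NNP": 41, "almost_constituent:NNPS": 6,
    "almost_constituent:NNS": 43, "almost_constituent:NP": 283,
    "almost_constituent:PP": 8, "almost_constituent:PRT": 1,
    "almost_constituent:QP": 3, "almost_constituent:RB": 3,
    "almost_constituent:ROOT": 70, "almost_constituent:S": 15,
    "almost_constituent:SBAR": 16, "almost_constituent:SBARQ": 1,
    "almost_constituent:VBN": 1, "almost_constituent:VP": 31,
    "almost_constituent:X": 4,
    "almost_constituent_2ndpass_:NN": 1, "almost_constituent_2ndpass_:NNP": 1,
    "almost_constituent_2ndpass_:NP": 5, "almost_constituent_2ndpass_:PRP$": 1,
    "constituent": 17435, "missing_tree": 1338,
    "multiple_constituents": 5014, "not_constituent": 5056,
    "unknown_error": 1309, "xclausal": 301,
}


def summary_category(label):
    """Fold every almost_constituent variant (first and second pass) into one bucket."""
    if label.startswith("almost_constituent"):
        return "almost_constituent"
    return label


summary = Counter()
for label, count in raw_counts.items():
    summary[summary_category(label)] += count

for category, count in sorted(summary.items()):
    print(f"{category:<24}{count:>6}")
```

Matching on the label prefix keeps the grouping independent of which part-of-speech tags happen to appear in a given run.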
Tallying further, I count as intended or actual constituents the sum of the almost-constituent, constituent, and multiple-constituent links: 728 + 17435 + 5014 = 23177.
The ultimate non-constituents are the sum of the non-constituent and cross-clausal links: 5056 + 301 = 5357.
The grand total of correctly parsed links, which leaves out the links with missing trees and unknown errors, is 23177 + 5357 = 28534.
So intended/actual constituents accounted for 81.23% of the correctly parsed links, while non-constituents accounted for the remaining 18.77%.
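The arithmetic behind these percentages can be checked with a short Python sketch. The variable names are illustrative; the counts are the summary values given above.

```python
# Summary counts from the tally.
almost_constituents = 728
constituents = 17435
multiple_constituents = 5014
non_constituents = 5056
cross_clausal = 301

# Intended/actual constituents versus ultimate non-constituents.
constituent_like = almost_constituents + constituents + multiple_constituents
non_constituent_like = non_constituents + cross_clausal

# Correctly parsed links exclude missing-tree and unknown-error links.
parsed_total = constituent_like + non_constituent_like

pct_constituent = round(100 * constituent_like / parsed_total, 2)
pct_non_constituent = round(100 * non_constituent_like / parsed_total, 2)

print(constituent_like, non_constituent_like, parsed_total)
print(f"{pct_constituent}% constituents, {pct_non_constituent}% non-constituents")
```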