Consistency

My goals yesterday, from 10am to 3pm, were to read a text by Birgitta Bexten called Salience in Hypertext and to start looking over some code that Mitcho had already written. My ultimate goal is to start scraping hyperlinks and text from Metafilter.com as soon as possible and to determine the constituency of the hyperlinks using the Stanford and Berkeley parsers.

I got almost nothing from Bexten, except her suggestion that hyperlinks (at least in her examples) can be either constituents or non-constituents. The text suggested pretty much nothing else.

I took a quick peek at the constituency-determining code, and at first glance (and going by our prior meeting) I believe all it does is check for the same number of left and right parens (the parsers’ version of the square brackets used in bracket notation). I was confused as to why that was the only way, so I did some research and reading to refresh my definition of a constituent. Yes, there are constituency tests based on grammaticality judgements, but simple code can’t do all of that. In Carnie’s Syntax: A Generative Introduction (2007), the final definition of a constituent is “A set of terminal nodes exhaustively dominated by a particular node”. So, given a parse from one of the parsers, balancing parens should be enough to determine constituency.
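To make the paren-balancing idea concrete, here is a minimal sketch in Python. It is not Mitcho's actual code, which I have only skimmed; the function names and the example fragments are my own assumptions. It assumes the parser output is a bracketed string such as "(NP (JJ special) (NNS programs))" and that we have already cut out the fragment covering the hyperlinked words.

```python
def is_balanced(fragment: str) -> bool:
    """True if every '(' in the fragment is closed by a later ')'."""
    depth = 0
    for ch in fragment:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' with no matching '(' inside the fragment
                return False
    return depth == 0


def is_single_constituent(fragment: str) -> bool:
    """Stricter check (an assumption of this sketch): the fragment must be
    balanced AND wrapped in exactly one outermost pair of parens."""
    fragment = fragment.strip()
    if not fragment.startswith("(") or not is_balanced(fragment):
        return False
    depth = 0
    for i, ch in enumerate(fragment):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                # The first paren must close only at the very end.
                return i == len(fragment) - 1
    return False


# Hypothetical fragments cut from a parser's bracketed output.
print(is_balanced("(JJ interactive) (NNS programs))"))  # unmatched ')'
print(is_single_constituent("(NP (JJ special) (JJ interactive) (NNS programs))"))
```

One caveat the sketch makes visible: a fragment like "(JJ special) (JJ interactive)" is balanced, but it is two sibling nodes rather than a set of terminals dominated by one node, which is why the second function also checks for a single enclosing pair.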

That should mean that any single word (head) is a constituent as long as there aren’t any complements to it.

One thing in Bexten that confused me was on page 15, where she gives an example of a text with a hyperlink in German: “Bei Simulationen handelt es sich um spezielle interaktive programme, die dynamische Modelle von Apparaten Prozessen und Systemen abbilden”, which translates to “Simulations are special interactive programs which represent dynamic models of devices, processes and systems.”

Bexten claims that “It also occurs that … not a whole constituent is link-marked but only, e.g., an adjective.” That implies to me that the adjective isn’t a constituent.

This is what confused me and made me refresh my definition of a constituent. I threw the English sentence into the Stanford parser (I’m assuming that DPs, AdjPs, and NPs work similarly in German, though that assumption may not even be necessary) and got a tree where the adjective was very much a constituent (by parens balancing). I also hand-traced the syntax tree for the DP “special interactive programs”, and by Carnie’s definition the adjective is a constituent there too:

[DP
    [D'
        [D Ø]
        [NP
            [N'
                [AdjP
                    [Adj'
                        [Adj special]
                    ]
                ]
                [N'
                    [AdjP
                        [Adj'
                            [Adj interactive]
                        ]
                    ]
                    [N'
                        [N programs]
                    ]
                ]
            ]
        ]
    ]
]
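The hand-traced tree above can also be checked mechanically against Carnie’s definition. The following is only a sketch under my own assumptions: the tiny parser, the function names, and the choice to treat Ø as silent (so it is skipped when listing terminals) are all mine and not from any parser’s API. It also matches by word sequence rather than by position, which is fine for this one phrase but would be ambiguous with repeated words.

```python
import re

NULL = "Ø"  # assumption: the empty determiner contributes no pronounced word

TREE = """
[DP [D' [D Ø]
        [NP [N' [AdjP [Adj' [Adj special]]]
                [N' [AdjP [Adj' [Adj interactive]]]
                    [N' [N programs]]]]]]]
"""


def parse(text):
    """Parse '[Label child ...]' notation into (label, children) tuples.
    Words are plain strings."""
    tokens = re.findall(r"\[|\]|[^\s\[\]]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                    # skip '['
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != "]":
            if tokens[pos] == "[":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                    # skip ']'
        return (label, children)

    return node()


def leaves(tree):
    """All pronounced terminal nodes dominated by this node, left to right."""
    if isinstance(tree, str):
        return [] if tree == NULL else [tree]
    return [w for child in tree[1] for w in leaves(child)]


def is_constituent(tree, words):
    """True if some node dominates exactly these words and nothing else."""
    if leaves(tree) == words:
        return True
    if isinstance(tree, str):
        return False
    return any(is_constituent(child, words) for child in tree[1])


tree = parse(TREE)
print(is_constituent(tree, ["special"]))                 # True
print(is_constituent(tree, ["interactive", "programs"])) # True
print(is_constituent(tree, ["special", "interactive"]))  # False
```

The last call is the interesting one: the two adjectives together are not exhaustively dominated by any single node in this tree, while each adjective on its own is.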

It’s not a big deal; I just want to make sure my definition of a constituent is correct, because Bexten made it seem like the adjective wasn’t a constituent.


One thought on “Consistency”

  1. Carnie’s view is correct: constituency is defined purely in terms of dominance, and thus any single word should always be a constituent. I’m also not entirely sure what Bexten meant there.

    Note also that words are their own nodes in the tree as well, so even if our theory didn’t give us bar-levels or phrase-levels, for example, “special” above would still be a constituent by virtue of being dominated by itself.
