Adobe Connect, GoPost, etc.
All meetings are linked from the course website’s schedule page. I’ll post them, usually by the morning following class.
Use all three:
Go to the FAQ first. If you need clarification, use GoPost. If you still don’t get a good answer, e-mail the professor or the TA.
Patas, Condor
Is Patas the same as Dryas?
Effectively, yes. dryas.ling.washington.edu is a new machine, but it should work just like patas.ling.washington.edu. You can SSH into either one to test your code. If Patas seems busy, try Dryas.
Why can’t I execute my Python script using ./mycode.py?
The file needs execute permission: chmod u+x mycode.py
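For reference, the effect of chmod u+x can be sketched in Python. This is only an illustration, written for a current Python 3 rather than the python2.6 on the cluster; the helper name make_user_executable and the file name mycode.py are assumptions made for the example.

```python
import os
import stat


def make_user_executable(path):
    """Add the owner-execute bit to path, like `chmod u+x path`."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR)
    # Report whether the owner-execute bit is now set.
    return bool(os.stat(path).st_mode & stat.S_IXUSR)


if __name__ == "__main__":
    # Hypothetical file name; only touched if it exists.
    if os.path.exists("mycode.py"):
        make_user_executable("mycode.py")
```

Running ./mycode.py also relies on a shebang line at the top of the script, so that the shell knows which interpreter to use.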
How do I package my homework for submitting on CollectIt?
Please tar your assignment. To tar from the directory that your work is in:
$ tar -cvf hw2.tar *
This produces hw2.tar. Please turn in this file.
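If you would rather check the archive from Python, the standard tarfile module can build and list an equivalent file. The sketch below is an assumption-laden illustration (current Python 3; the function names tar_directory and list_archive are made up here), not a required step for submission.

```python
import os
import tarfile


def tar_directory(directory, archive_name="hw2.tar"):
    """Write every non-hidden entry of directory into an uncompressed tar."""
    archive_path = os.path.join(directory, archive_name)
    # Like the shell's `*`, skip dotfiles; also skip the archive itself.
    names = sorted(
        n for n in os.listdir(directory)
        if not n.startswith(".") and n != archive_name
    )
    with tarfile.open(archive_path, "w") as tar:
        for name in names:
            # arcname keeps member paths relative to the directory.
            tar.add(os.path.join(directory, name), arcname=name)
    return archive_path


def list_archive(archive_path):
    """Return the member names stored in the archive."""
    with tarfile.open(archive_path, "r") as tar:
        return tar.getnames()
```

Listing the members before you upload is a quick way to confirm that nothing is missing from hw2.tar.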
So what’s the bottom line with the cmd file for the homeworks?
If your process takes more than a few seconds to run, create a cmd file so that Condor will distribute the processing load. Name the file after the assignment, for example hw1.cmd, hw2.cmd, etc.
Why bother with Condor?
The Condor system distributes the computation load among several processors. For small tasks, it really doesn’t matter, but when the cluster is busy and you have a processor-intensive job, using Condor makes the best use of resources. See the CLMA Wiki for more info.
What’s the difference between condor-exec and condor_submit? When do I need to use them?
Use condor-exec when testing your code, especially code that might take a long time to run. With condor-exec, you don’t need a cmd file, because one is created for you automatically. For example, run condor-exec ./test.py, where test.py contains your main Python code with a shebang line at the top: #!/opt/python-2.6/bin/python2.6
We’ll use condor_submit to run homeworks. You need to create a cmd file, which condor_submit executes like this: condor_submit mycmd.cmd. That way, you have more flexibility in how your code is structured and run, while we can run everyone’s homework the same way.
Where’s the main documentation for Condor?
There’s a CLMA Wiki entry.
When do I just use python mycode.py?
For testing small Python scripts, just use python mycode.py.
What do I do if Condor won’t execute my hw1.cmd file?
Check the permissions on your executable script. If the script does not have the right permissions set (e.g. u+x), Condor will not run it from hw1.cmd.
What do I do if I get a .log error?
Try creating a blank filename.log file and setting its permissions so that your script can write to it.
How do I manage (and remove) my jobs on Condor?
You can look at the queue by executing condor_q and remove a job with condor_rm #id, where #id is the job’s ID in the queue output. A job is held if its status (“ST”) is “H”.
How can I create a .cmd file using the condor-exec command?
1. Run condor-exec, for example:
condor-exec python2.6 earley.py grammar.cfg sentences
2. Rename the file that was automatically created in step 1, whose name ends with three digits (for example, condor_submit.000), to hw1.cmd.
3. Modify the Log, output and error lines in hw1.cmd.
Original lines:
Getenv = true
Log = /tmp/$ENV(USER).condor.$(Cluster)
Universe = vanilla
executable = /opt/python-2.6/bin/python2.6
arguments = earley.py grammar.cfg sentences
output = python2.6.$(Cluster).$(Process).output
error = python2.6.$(Cluster).$(Process).error
Queue
New lines:
Getenv = true
Log = hw1.log
Universe = vanilla
executable = /opt/python-2.6/bin/python2.6
arguments = earley.py grammar.cfg sentences
output = hw1.out
error = hw1.err
Queue
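The edit from the original lines to the new lines can also be scripted. The following is a minimal sketch, assuming the generated file uses simple "key = value" lines as shown above; the function name rename_condor_outputs and the prefix argument are made up for this illustration.

```python
import re


def rename_condor_outputs(cmd_text, prefix="hw1"):
    """Point the Log, output and error lines of a cmd file at prefix.* files."""
    new_values = {
        "log": prefix + ".log",
        "output": prefix + ".out",
        "error": prefix + ".err",
    }
    result = []
    for line in cmd_text.splitlines():
        match = re.match(r"\s*(\w+)\s*=", line)
        key = match.group(1) if match else None
        if key and key.lower() in new_values:
            # Keep the key's original capitalization; replace only the value.
            line = "%s = %s" % (key, new_values[key.lower()])
        result.append(line)
    return "\n".join(result) + "\n"
```

To use it, read the renamed hw1.cmd, pass its text through rename_condor_outputs, and write the result back. All other lines, such as executable, arguments and Queue, are left untouched.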
Grammars and Parsing
My parser’s tagging accuracy is low. How can I improve it?
- Use a good POS tagger, e.g., an HMM tagger trained on the same training corpus as your parser.
- Simply guess NN for all unknown or low-probability words.
How do I improve the runtime of a parser?
- Use hash tables widely and wisely.
- Use log probabilities (summing rather than multiplying).
- Keep your POS rules and nonterminal rules in two separate structures. You don’t need to loop over POS rules (e.g., NN --> efficiency) when you know you’re looking for a nonterminal.
- Limit the number of entries in each cell (if using a chart algorithm): e.g., try setting a probability threshold k (throw out entries with a probability less than k).
- Use a smaller grammar, e.g., by manipulating Markovization factors when you transform to CNF.
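Several of these tips can be combined in a few lines. The sketch below is one possible arrangement, not a reference implementation: it assumes a CNF grammar given as (parent, children, probability) triples, and the names build_tables, combine and log_k, as well as the toy rules, are invented for illustration.

```python
import math


def build_tables(rules):
    """Split CNF rules into a lexical table and a binary-rule table."""
    lexical = {}  # word -> {POS tag: log probability}
    binary = {}   # (left child, right child) -> {parent: log probability}
    for parent, children, prob in rules:
        if len(children) == 1:
            lexical.setdefault(children[0], {})[parent] = math.log(prob)
        else:
            binary.setdefault(tuple(children), {})[parent] = math.log(prob)
    return lexical, binary


def combine(left_cell, right_cell, binary, log_k):
    """Build one chart cell from two child cells, then prune it."""
    cell = {}
    for left, lp_left in left_cell.items():
        for right, lp_right in right_cell.items():
            # Hash lookup on the child pair; POS rules are never scanned.
            for parent, lp_rule in binary.get((left, right), {}).items():
                score = lp_rule + lp_left + lp_right  # sum of logs
                if score > cell.get(parent, float("-inf")):
                    cell[parent] = score
    # Throw out entries whose log probability is below the threshold.
    return {label: lp for label, lp in cell.items() if lp >= log_k}


# Toy grammar, for illustration only.
RULES = [("DT", ["the"], 1.0), ("NN", ["dog"], 0.5), ("NP", ["DT", "NN"], 0.8)]
lexical, binary = build_tables(RULES)
np_cell = combine(lexical["the"], lexical["dog"], binary, math.log(1e-6))
```

Indexing binary rules by their child pair means each combination costs one dictionary lookup instead of a scan over the whole grammar, and the threshold keeps cells from growing without bound.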
What is evalb?
- evalb is the de facto standard evaluation application used on all kinds of parsers. It implements the PARSEVAL metrics. It is highly configurable.
Are Bracketing Recall and Bracketing Precision in the evalb output the same as the Labeled Recall and Labeled Precision discussed in lecture?
- Yes. evalb assumes that node labels matter, so its bracketing recall/precision is the same as labeled recall/precision, although the README file does not say so. The tagging accuracy is the same as POS tagging accuracy.
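To make the role of labels concrete, here is a simplified sketch of labeled precision and recall over constituents written as (label, start, end) triples. It is not evalb itself and ignores evalb's configuration options; the function name bracket_scores and the example brackets are assumptions for illustration.

```python
from collections import Counter


def bracket_scores(gold_brackets, test_brackets, labeled=True):
    """Return (precision, recall) over (label, start, end) brackets."""
    if not labeled:
        # Drop the labels so that only the spans are compared.
        gold_brackets = [(s, e) for _, s, e in gold_brackets]
        test_brackets = [(s, e) for _, s, e in test_brackets]
    gold = Counter(gold_brackets)
    test = Counter(test_brackets)
    # Multiset intersection: each gold bracket can be matched only once.
    matched = sum((gold & test).values())
    precision = matched / float(sum(test.values())) if test else 0.0
    recall = matched / float(sum(gold.values())) if gold else 0.0
    return precision, recall


# Made-up example: the parser gets every span right but mislabels one node.
GOLD = [("S", 0, 3), ("NP", 0, 1), ("VP", 1, 3), ("NP", 2, 3)]
TEST = [("S", 0, 3), ("NP", 0, 1), ("VP", 1, 3), ("PP", 2, 3)]
```

With these brackets, the labeled scores penalize the mislabeled node while the unlabeled scores do not, which is the difference that the question is asking about.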
Computational Semantics
Lambda calculus demo
In general, there are two ways to represent the verb see:
- In a relational style of predication (what most of you are used to): see(Jane,Mike)
- In a functional style of predication: ((see Mike) Jane) or shorthand (see Mike Jane)
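The two styles can be mimicked in Python, where the functional style corresponds to a curried function that takes the object first and the subject second. This is only an illustrative sketch; representing the resulting predication as a tuple is an assumption made here.

```python
def see_relational(subject, obj):
    """Relational style: see(Jane, Mike)."""
    return ("see", subject, obj)


def see_functional(obj):
    """Functional style: ((see Mike) Jane), one argument at a time."""
    def with_subject(subject):
        return see_relational(subject, obj)
    return with_subject


# (see Mike) is itself a function, still waiting for its subject.
see_mike = see_functional("Mike")
```

Applying see_mike to "Jane" yields the same predication as see_relational("Jane", "Mike").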