Difference between revisions of "Developmental Robotics and Neural Networks"
Revision as of 17:19, 5 August 2009
Welcome, ye Computer Science enthusiasts, to Meena's 2009 Summer Research Wiki.
- 1 Abstract (As of 19 June 2009)
- 2 Research Done Thus Far
- 3 Conclusions Drawn Thus Far
- 4 Media
- 5 References/Useful Links
Abstract (As of 19 June 2009)
Meena Seralathan
Mentor: Doug Blank
Developmental robotics is an interdisciplinary field of study working towards understanding the human mind, and emulating such complex processes in mathematical computations done by a computer chip. In doing so, developmental roboticists must give the robot's mind the ability to learn in real-time, on its own, and to develop its own goals and motivations over time based on what it has learned. Many existing artificial neural networks (ANN; mathematical representations of a human neural network) cannot handle real-time input, or have poor memory handling, causing the networks to forget what they have learned once the robot travels to a different environment, and to be unable to effectively process new input. Thus the purpose of this experiment is to explore the many different structures for ANNs, and to modify them to create a better learning system for robots. By improving the way the ANN retains memory, how it reacts to different stimuli, and how it makes generalizations and abstractions of its environment, we hope to work towards developing a network that can learn effectively as it explores its environment.
Research Done Thus Far
27 May 2009
- Discussed ideas in developmental robotics.
- Installed Pyjama under Linux (Fedora) and ran into an issue with the return key.
- Tried some tests with the Robonova. First tried sending commands to the robot via the serial cable connected to the converter by using the code I wrote last year; Robonova moved without any problems. Then connected the fluke to the converter and tried to send information to the robot (results have been pasted here: File:WithConverter.txt). Sending information to the fluke was not a problem; getting the robot to receive anything was problematic. Then tried plugging fluke directly into the robot, but this prevented us from even connecting to the fluke via bluetooth. Looked at fluke code and it looks like it should be correctly sending information...
28 May 2009
Mostly spent today testing wire connections and the fluke and the serial cable.
- First started with the cable; plugged it directly in and commands worked. Unplugged ground; robot started spazzing and couldn't be controlled. Reset the robot a few times and plugged everything back in; then heard the beep I'm supposed to hear if the robot receives an incorrect byte ("error beep") when I opened the port. Then tried unplugging ground; was able to control robot again. Whenever I tried plugging in ground again I would get the beeps instead of actions. Whenever I unplugged ground it worked fine. Then tried connecting the cable to the TTL converter; it worked both with and without ground plugged in.
- Then started with the fluke. I first made sure I could connect to the fluke on its own and then to the fluke connected to the TTL converter (but not the robot). Worked fine, so then connected the fluke/converter to the robot with the ground cable plugged in; was still able to send messages to the fluke. Was also able to send messages without ground plugged in. Then tried without the converter (ground in); this was when the connection between the computer and the fluke started failing. Plugged the fluke in without any of the wires connected to the robot, and was able to send information to the fluke. Then plugged the TX wire into the robot when it was still off, and the connection between the computer and the fluke was lost. The same happened with the RX wire. Then tried connecting the ground wire first, then switching the TX and RX cables; still had a connection, so turned the robot on. The connection was maintained, so then I tried sending bytes to the robot through the fluke (used init()). First time there seemed to be a problem with sending bytes (waited awhile and nothing happened; ctrl+c'd). Tried again, and started getting those error beeps again (two when I tried simply sending the letter for a move, three when I sent the "pass n bytes" command character plus the move letter, and then two when I sent the command character, the 1 character and the move letter). Then tried unplugging the ground to see what would happen; robot began beeping nonstop until I unplugged the fluke (and eventually the RX wire). Once I only had the TX wire connected, I was again able to send a byte and get two of the error beeps.
- Something else I noticed is that a pin layout I had from Drexel last year was different than the one I had found and had been using; I tried theirs and ended up with the same results for both the cable and the fluke (worked with the cable and the fluke made the robot beep). Not sure why that is.
- Want to try the fluke again while not using init(), and due to the last bullet I want to try using different pins to figure out how two different erx/etx layouts could both work with the serial cable.
- Copy of IDLE output here (File:28May2009.txt), Drexel layout here (File:Screenshot.png), layout I found here
29 May 2009
Tried some more fluke tests, making sure I set the baud rate between the fluke and robot each time; still no results. Doug suggested that the reason the fluke is losing the connection to the computer when plugged into the robot could be that it's going into "programming mode" (i.e., a mode where it's expecting to have stuff downloaded onto it as opposed to having stuff sent/passed through it). This only happens when the transmit wire is connected to the robot (not when the receive/ground wires are connected).
Also installed Pyjama on my computer, and was unable to open it directly; had to build it in Visual C# Express (learned how to download stuff through SVN in the process).
1 June 2009
Read a lot of papers about neural networks to get a better understanding of the math behind them and of how to make abstractions; started getting better acquainted with Pyro again by reading about how to use the interface and how robots/networks are programmed in it.
2 June 2009
3 June 2009
Discussed neural networks, wrote some networks in Pyro (AND, OR, XOR, and one which takes a picture and determines whether or not a neon green alien bottle is in it).
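As a point of comparison for the AND, OR, and XOR networks, here is a minimal sketch with hand-set weights rather than trained ones. It is not the Pyro code used above; the function names and weight values are illustrative assumptions. It shows why AND and OR each fit in a single unit while XOR needs a hidden layer.

```python
def step_unit(inputs, weights, bias):
    """Return 1 when the weighted sum of the inputs plus the bias is positive."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


def and_gate(a, b):
    # Both inputs must be on to overcome the -1.5 bias.
    return step_unit([a, b], [1.0, 1.0], -1.5)


def or_gate(a, b):
    # A single active input is enough to overcome the -0.5 bias.
    return step_unit([a, b], [1.0, 1.0], -0.5)


def xor_gate(a, b):
    # Hidden layer (hand-set, an assumption): one OR unit and one AND unit.
    h_or = or_gate(a, b)
    h_and = and_gate(a, b)
    # The output fires when OR is on and AND is off.
    return step_unit([h_or, h_and], [1.0, -1.0], -0.5)
```

Calling each gate on the four input pairs (0,0), (0,1), (1,0), (1,1) reproduces the usual truth tables; a trained network has to discover comparable weights on its own.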
4 June 2009
Wrote a neural network to recognize one of 4 shape categories (nothing = 0, circle = 1, triangle = 2, square = 3). It was able to recognize the shapes it was trained on, and could also recognize some of the same shapes in different (grayscaled) colors (circles and triangles gave it some trouble, though). Results here
For the second set, am going to try training only one sort of shape in multiple positions at a time, rather than three sorts of shapes in various positions.
8 June 2009
Was going to try changing my network to a cascade correlation network through Pyro, but the code seems to have changed quite a bit from the form presented on the website, and instead I decided to backtrack in order to learn more about the network and see if I can get a grasp of what has been implemented differently.
Started reading about reinforcement learning, Temporal Difference learning, the Monte Carlo method. Modified the RLBrain code in Pyro to be more likely to move to areas it hasn't visited as much yet.
While it definitely explored more, it had a less successful time finding the goal and continuing to find it in a relatively swift manner. Then I tried altering the random move percentage; changing it from 20% to 50% not only improved the movement of the robot (it didn't practically fill up the maze before running into a pit or the goal as at 20%), but it seemed to get more information about the pits more quickly. However, it still did not travel to the goal as consistently as it had before. Changing the random percentage to 10% seemed to actually cause the robot to find the goal more often, but it seemed to have a much worse idea as to where everything was in the environment (it would run into the same pit from the same angle multiple times in a row before moving on; something that didn't happen at 50%). Negative reinforcement became trivial when the robot felt it had fewer options about where to move.
In general this sort of exploration seems best for creating a map of the area, but not for finding paths to (or away from) a destination, which was to be expected.
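The trade-off described above can be sketched as a move chooser that combines a random-move percentage with a bonus for rarely visited cells. This is not the modified RLBrain code: the function name `choose_move`, its parameters, and the bonus formula `1 / (1 + visits)` are assumptions made for illustration.

```python
import random


def choose_move(values, visits, random_rate, rng=random):
    """Pick the index of the next move.

    values      -- learned value estimate for each candidate move
    visits      -- how often each candidate destination has been visited
    random_rate -- probability of ignoring both lists and moving at random
    """
    if rng.random() < random_rate:
        return rng.randrange(len(values))
    # Exploration bonus (an assumption): rarely visited cells score higher.
    scores = [v + 1.0 / (1 + n) for v, n in zip(values, visits)]
    return scores.index(max(scores))
```

With `random_rate=0.0` the choice is driven entirely by value plus bonus, so `choose_move([0.0, 0.0, 0.0], [5, 0, 2], 0.0)` picks the unvisited middle cell. A large bonus favors mapping the area over heading to the goal, which matches the behavior noted above.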
10 June 2009
Read about behaviors in Pyro; Cascade Correlation, studied the cascade correlation code a little more to see how it may have been altered since the Pyro website was created.
Also tried running shape tests on neural networks again, messing with hidden layer sizes to try and get networks that could recognize shapes in different positions better. This time I only used large, filled-in shapes (circle, square, triangle, nothing) to train the network, rather than filled-in large, filled-in small, and hollow. When trained to recognize whether an image has a circle (anywhere) in it or nothing at all, the network was best able to learn with only 9 hidden layers (it could distinguish the difference every time; with networks around 5 layers or less, the network was nearly untrainable, and above 9 layers caused the network to mistake many of the circles for empty images).
When I increased the number of shapes (all four possibilities rather than just circles or nothing), the network seemed to have the least number of mistakes at around 5 hidden layers (below 5 did not work (1 layer caused the network to be unable to get more than 50% of the training shapes right, 4 layers couldn't get more than 75%), and the network got increasingly worse at guessing shapes during the test phase the more layers were added). Since changing the number of hidden layers didn't seem to help accuracy, will try changing the tolerance.
Training/testing results here.
11 June 2009
Ran network again, training network on more than just the shape in the middle of the image. Training took noticeably longer, as was expected, but the results were better; training on all the images, of course, allowed the network to get all the shapes correct, and training on half the images (the other half being similar but not exactly the same pictures) allowed it to get many (but not all) of the images. Results hither. Doug suggested also trying to have a gray buffer area between the black and white areas of the image in order to give the network more information about the shape; will try implementing this tomorrow.
12 June 2009
Ran the network a couple times because I was noticing that the network was taking drastically longer in some instances to train than in others. Also implemented the grey line area around the shapes, but the network took so long to train after this alteration that I can't tell if it learned better or not.
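One way the gray buffer could be produced is sketched below. The image representation (rows of floats, 0.0 for black and 1.0 for white), the 0.5 gray value, and the rule of graying white pixels that directly touch a black pixel are all assumptions, not the code actually used for these tests.

```python
BLACK, GRAY, WHITE = 0.0, 0.5, 1.0


def add_gray_buffer(image):
    """Return a copy of image in which white pixels touching a black pixel turn gray.

    image is a list of rows, each a list of floats (0.0 = black, 1.0 = white).
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != WHITE:
                continue
            # Check the four direct neighbours in the original image.
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width and image[ny][nx] == BLACK:
                    result[y][x] = GRAY
                    break
    return result
```

The neighbours are read from the original image and written to a copy, so the buffer stays one pixel wide instead of spreading as the loop proceeds.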
15 June 2009
Ventured to Swarthmore to learn about the robotics research students there were taking part in. Learned about the Rovio robot and discussed its pros and cons in relation to the Scribbler robots; our opinion as to whether or not they'd make a good replacement is still developing.
Saw that the network I had started on Friday had been unable to learn properly after 5000 epochs (about 34% correct by the final epoch), so didn't bother trying to test the network. Instead I reloaded my program and ran it again, and this time the network only took 48 epochs to train off 20 shapes, and got 6/40 shapes (20 of which were similar to the 20 it trained off of but weren't included in the training) wrong. This doesn't seem like an improvement from the original network. I remembered that the 5-hidden layer network was based on the concept of only using 4 training images and figure that changing the hidden layer (maybe back to 9?) will make training better; the issue of having to change the number of hidden layers to some specific number whenever I want to change the number of inputs is starting to get tedious, though, so I am going to go back to learning how cascade correlation networks are implemented in Pyro and see if I can figure out how the code's changed. (Results of Friday/Today here)
Update : Figured out how to not get errors while running cascor, but I don't think it's really making a cascor network anymore...
16 June 2009
Cascor network didn't really seem to be working correctly at all (seems like the network being made was simply a normal network with no hidden layers).
Doug gave me a network based on the Elman network (in which the network retains memory through a context layer). The idea is to get it to be able to learn XOR through sequential input, so that it will be able to see that certain values followed by certain other values should be followed by some output (ex. getting a 1 and then getting a 0 means that one should get 1 next).
The network seems to be able to do this for AND, but not OR or XOR. I will try tweaking the number of hidden layers and the epsilon, etc, and see if I can get OR to work, and if I can then use the weights from the OR network to train the network with XOR.
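The context-layer idea can be sketched as a single forward step in plain Python. This is an illustrative assumption, not the conx implementation: biases are omitted, and the names `elman_step`, `w_in`, `w_ctx`, and `w_out` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def elman_step(value, context, w_in, w_ctx, w_out):
    """Run one time step and return (output, new_context).

    value   -- the single input for this step
    context -- hidden activations saved from the previous step
    w_in    -- input-to-hidden weights, one per hidden unit
    w_ctx   -- context-to-hidden weights, w_ctx[j][i] from context j to hidden i
    w_out   -- hidden-to-output weights, one per hidden unit
    """
    size = len(context)
    hidden = []
    for i in range(size):
        net = value * w_in[i] + sum(context[j] * w_ctx[j][i] for j in range(size))
        hidden.append(sigmoid(net))
    output = sigmoid(sum(h * w for h, w in zip(hidden, w_out)))
    # The new context is simply a copy of the hidden activations.
    return output, list(hidden)
```

Feeding the returned context back in on the next call is what lets the output for the second value depend on the first.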
17 June 2009
There was an oceanography radio speaker today, so we went to listen to him speak about his career and his scientific background. It was very interesting.
Also read about RAVQ governors and read Fritzke's paper on growing neural gas. Am running yesterday's network again in a last-hope effort to get weights from it, and assuming that doesn't work I will try implementing a governor or trying the GNG approach and see if it can learn XOR.
18 June 2009
Learned that the reason cascor wasn't working is that the Windows version of Pyro was 4.8 rather than 5.0, and the files for 5.0 did not work in Windows.
Got the cascade correlation network working in Fedora; the network was trained on 20 images (nothing, circle, triangle, square; black and white; bottom-left, bottom-right, center, top-left, top-right), and then tested on 80 (containing duplicates of the five positions, and an extra set of 10 shapes in blue and white). The network didn't require any hidden layers to get 100% in training, and guessed each picture correctly.
Fellow Swarthmore and Sarah Lawrence AI researchers stopped by to discuss a number of things, such as a more detailed look at GNGs, the Elman network, more about learning in real-time, etc. Deepak mentioned Reservoir Computing over lunch, and it does look like one of the implementations (the Echo State Network) would be a good improvement over the Elman network, should we be unable to get it to work.
Doug gave us some NSF interview questions to do. Also read a little more about the Echo State Network (found a paper about it), and am thinking about how to implement it.
Read the ESN paper, started looking at the conx code in more detail to think about how to create the ESN off what already exists.
Found the Elman paper in which he describes his XOR experiment. Tried his way of doing the experiment (having a series of bits fed in one at a time) by putting all the values in a list and having the network get one at a time (every third was the XOR'd value of the previous two). When trying this the network claimed to be able to learn in a single epoch, though this was proven false with testing (I gave the network three values one at a time and saw what the output was; the network more or less outputted the same value regardless of the pattern given as input).
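The bit list described above (every third value being the XOR of the previous two) can be generated along these lines. The helper names `xor_stream` and `prediction_pairs` are hypothetical, and the sketch uses plain 0/1 bits rather than any scaled values.

```python
import random


def xor_stream(n_triples, rng=random):
    """Return a flat list of bits where every third bit is the XOR of the two before it."""
    bits = []
    for _ in range(n_triples):
        a = rng.randint(0, 1)
        b = rng.randint(0, 1)
        bits.extend([a, b, a ^ b])
    return bits


def prediction_pairs(bits):
    """Pair each bit with the bit that follows it (the prediction target)."""
    return list(zip(bits[:-1], bits[1:])) 
```

Only the third bit of each triple is predictable, so a network that has learned the task should show lower error on every third prediction and near-chance error elsewhere.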
Also tried experiments with changing how often the context layer was updated; while it slightly changed how the error changed over time, the change was not for better or worse.
Went back to the OR problem and ran tests, setting the outputs for value1 and the target to constants, and having the network train on value2 based on the OR value of value1 and value2. The network outputs the same value for each input (outputs the same value every time a 1 is given, etc); thus when the second value in the pattern is 1, the network will always output the value it has for 1, rather than a prediction for the next value.
Removed the constants and had the network try to predict everything, 5 hidden/context, 0.5 epsilon (everything else same); am noticing that I'm not getting the same dip in error every third pass, and I think this could be because Elman ran the same 3000-bit sequence through his network for his 600 passes, while Doug's code uses randomly generated patterns throughout training?
29 June - 3 July
29 June - 1 July
Started skimming through the Conx code to see how networks are implemented in it, to try and figure out whether it can easily be modified for an ESN, and to learn a bit more about how it all works underneath.
It's a very, very long file.
Doug suggested I try implementing the ESN from scratch, and Conx seems too long for me to just figure out everything from it, so am hunting the web for information on how to implement weighted graphs and the specific mathematics that goes into calculating weights, error, activations, etc.
Still on the hunt. Because there's only 1% connectivity between reservoir nodes in the ESN I don't want to use a matrix to store weight values, but for the time being I'll use one because it seems like the simplest way to go about it. Having trouble finding the specific calculations networks use and the values that need to be kept during the process.
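A possible alternative to the full matrix is a dictionary that stores only the connections that exist. The sketch below is an assumption about how that could look, not the code written for this project; the weight range of -1 to 1 and the choice to allow self-connections are illustrative.

```python
import random


def sparse_reservoir(n_nodes, connectivity=0.01, rng=random):
    """Return {source: {target: weight}} holding only the connections that exist."""
    n_edges = int(round(n_nodes * n_nodes * connectivity))
    all_pairs = [(s, t) for s in range(n_nodes) for t in range(n_nodes)]
    weights = {s: {} for s in range(n_nodes)}
    # Sample distinct (source, target) pairs so the edge count is exact.
    for s, t in rng.sample(all_pairs, n_edges):
        weights[s][t] = rng.uniform(-1.0, 1.0)
    return weights
```

For 100 nodes at 1% connectivity this keeps 100 weights instead of the 10,000 entries a matrix would need.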
Finally found a great website for learning about neural network implementation (based on lecture notes), and have been using it to figure out how I'm going to use the network frame I have to calculate stuff. Will turn my network into a fully-connected one to make sure it works before trying the ESN.
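For reference, the core calculations of a fully connected layer can be sketched as follows. This assumes sigmoid units and a sum-squared error, which is the standard backpropagation setup and not necessarily what the website or my network frame uses; all names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Activations of one fully connected layer.

    weights[i][j] is the weight from input i to unit j.
    """
    activations = []
    for j in range(len(biases)):
        net = biases[j] + sum(inputs[i] * weights[i][j] for i in range(len(inputs)))
        activations.append(sigmoid(net))
    return activations


def sum_squared_error(outputs, targets):
    return sum((t - o) ** 2 for o, t in zip(outputs, targets))


def output_deltas(outputs, targets):
    """Error signal for sigmoid output units: (target - output) * output * (1 - output)."""
    return [(t - o) * o * (1.0 - o) for o, t in zip(outputs, targets)]
```

The values that need to be kept between the forward and backward passes are the activations of every layer, since the deltas are computed from them.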
Have a better understanding of the math behind networks, and have been reading up on graphs to figure out how I'm going to be implementing different parts of the calculations. Also went to an Ethics workshop, where we discussed various issues in science research.
Thinking about how to traverse the graph so I can go down paths and calculate the right values as I go along. Have been playing around with variations of DFS in order to see if I can calculate and update activations as the algorithm goes along, with slight success.
Went to UKC Humanoid conference to present the ESN algorithm and to learn what the rest of the PIRE team is doing in terms of humanoids.
An incoming freshman has joined the team, and is learning Myro/Scribbler stuff. Ashley and I helped her along with a couple CS/Python basics, and she is working through the textbook until she wants to move on.
Also discussed how to traverse the reservoir and get all the activations; decided it would be too much work to get completely updated activations at each timestep, and that it should be fine if they get updated at the next step. Outlined how I'm going to set the network up.
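The decision to let activations catch up at the next step amounts to a synchronous update: every node's new value is computed from the previous step's values only. A minimal sketch, assuming tanh units and the dictionary layout shown in the docstring (both are assumptions, not the final design):

```python
import math


def reservoir_step(activations, weights, inputs):
    """Compute every node's next activation from the previous step's values only.

    activations -- {node: activation at the previous time step}
    weights     -- {source: {target: weight}} for reservoir connections
    inputs      -- {node: external input arriving at this time step}
    """
    net = {node: inputs.get(node, 0.0) for node in activations}
    for source, targets in weights.items():
        for target, weight in targets.items():
            net[target] += activations[source] * weight
    return {node: math.tanh(total) for node, total in net.items()}
```

With a chain a -> b, an input given to a at one step only reaches b on the following step, so no particular traversal order is needed within a step.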
Have a Node class, Reservoir class, Bias class, Output class, Graph class, and am working on the code that will actually send stuff in and around the network.
July 27-31 (END)
Conclusions Drawn Thus Far
# Test of Elman-style XOR in time.
import random

from pyrobot.brain.conx import *

# Binary 0 and 1 are represented as 0.2 and 0.8, respectively.
low, high = 0.2, 0.8

def xor(a, b):
    """ XOR for floating point numbers """
    if a < .5 and b < .5: return low
    if a > .5 and b > .5: return low
    return high

def AND(a, b):
    """ AND for floating point numbers """
    if a > .5 and b > .5: return high
    return low

def OR(a, b):
    """ OR for floating point numbers """
    if a < .5 and b < .5: return low
    return high

def randVal():
    """ Random 0 or 1, represented as 0.2 and 0.8, respectively. """
    if random.random() < .5: return low
    else: return high

if __name__ == '__main__':
    print("Sequential XOR modeled after Elman's experiment ...........")
    print("The network will see a random 1 or 0, followed by another")
    print("random 1 or 0. The target on the first number is 0.5, and ")
    print("the target on the second is the XOR of the two numbers.")
    # One input, one output, and a context layer the same size as the hidden layer.
    n = Network()
    size = 8
    n.addLayer("input", 1)
    n.addLayer("context", size)
    n.addLayer("hidden", size)
    n.addLayer("output", 1)
    n.connect("input", "hidden")
    n.connect("context", "hidden")
    n.connect("hidden", "output")
    n.setEpsilon(0.5)
    n.setMomentum(0.9)
    n.setBatch(0)
    n.setTolerance(.25)
    n.setReportRate(100)
    n.setLearning(1)
    n.setInteractive(0)
    lastContext = [.5] * size
    lastTarget = 0.5
    count = 1
    sweep = 1
    correct_all = 0
    total_all = 0
    tss_all = 0.0
    value1 = randVal()
    while True:
        value2 = randVal()
        # Swap the active line to change the function being learned.
        #target = xor(value1, value2)
        #target = AND(value1, value2)
        target = OR(value1, value2)
        #lastContext = [0.5] * size
        # Step 1: see value1, predict value2.
        n.step(input=[value1], context=lastContext, output=[value2])
        lastContext = n["hidden"].getActivations()
        # Step 2: see value2, predict the target. Only this step is scored.
        tss, correct, total, perr = n.step(input=[value2], context=lastContext, output=[target])
        lastContext = n["hidden"].getActivations()
        value1 = randVal()
        # Step 3: see the target; the next value is random, so train toward 0.5.
        n.step(input=[target], context=lastContext, output=[0.5])#value1])
        lastTarget = target
        correct_all += correct
        tss_all += tss
        total_all += total
        if (count % n.reportRate) == 0:
            percentage = float(correct_all)/float(total_all)
            print("Epoch: %5d, steps: %5d, error: %7.3f, Correct: %3d%%" % \
                  (sweep, count, tss_all, int(percentage * 100)))
            if percentage > .9: break
            correct_all = 0
            total_all = 0
            tss_all = 0.0
            sweep += 1
        count += 1
    print("Training complete.")
    n.saveWeightsToFile("ElmanOR.txt")
    n.setInteractive(1)
    n.setLearning(0)