Class: Informatics, Computing, and the Future
Instructor: Dan Berleant
Transcriber: Brooke Yu
Date: Tuesday, March 26, 2013
Speaker: In just a second you will see.... there you go. Ready to go.
Professor: Okay, we're going to hold off for another 6 or 7 minutes.
Professor: Did anyone not get a handout by the way? Any questions on the next homework?
So while we're waiting, how's the weather there?
Speaker: It's nice. There's a little snow left, but other than that it's been a nice day.
So are you from....
Professor: Hey, I had one copy of the introduction to the speaker. Did I give that to anybody?
Speaker: Would you like me to put the introduction on the screen?
Professor: Sure, go ahead and do that.
I've got it. Here it is.
Speaker: Okay, good enough.
Now, we don't have two cameras. Are you going to show us a picture of you on the side?
Speaker: No, I'm not. I thought about that just as I was starting, but no way. Got too much stuff on the screen as is.
I'm sure these students won't go into cognitive overload.
Speaker: I hope not. Although I did present this information to a colleague who said his head was going to explode.
Professor: Well, we're all rested. We just got back from spring break.
I will let you know that there is one chemistry major in the crowd.
Speaker: A heretic among the religious. Perhaps it will give him or her pause that they might end up this way themselves.
Professor: Hey, is the sound okay? Do you want it down a little? You don't care?
Speaker: Sorry, did you want me to put that on the screen?
No, we're all set.
Professor: Waiting for a couple more students to come in.
In some states students come to class early and in others they wander in late.
Speaker: I've never been in the former, but I'll take your word for it.
Why don't you talk, Harry?
Speaker: What is it you'd like me to talk about?
I'm just adjusting the speakers.
Speaker: Very well.
We still have a couple more minutes.
Professor: So any questions on the handouts? There are two of them- a homework and a questionnaire for today. As promised, we have a guest speaker. He's a chemist named Harry Pence. He serves as a faculty member at his campus. He has written and presented frequently about emerging learning technologies. He's a co-editor, along with Dr. Belford from our chemistry department, of a book on enhancing learning with online resources and social media.
He'll be speaking to us through Skype today and afterwards we'll have questions and answers.
Belford: If you have questions, he tried to write the slide numbers on his slides.
Speaker: The slide numbers didn't work, Bob.
Belford: Ah, we'll figure out another way.
Professor: Thank you. Let's begin.
Speaker: Okay, thanks to my colleagues from Little Rock, and to your students who are attending this lecture. How did a chemist get into this? I've been doing this kind of teaching with technology for many years now. When I retired five years ago the director of the teaching and learning technology center said "how would you like to be a faculty member for our Department?" I asked how much they paid, and they said nothing. They said I could do whatever I wanted to do.
The topic of today's discussion is big data. This is a frame I've had to do a number of times because.... are you still there? Okay. Good. One tidal wave after another. There's a surfer in this picture surfing the wave. I tend to do more dog paddling to keep afloat. This is a frame from a presentation by one of my favorite futurists, Gerd Leonhard, who says "data is the new oil." It's going to power the next generation of development, and I've slowly begun to understand that sentiment.
Basically, to frame this, John Wanamaker, who ran a department store, said [On board.]
One of the things big data is trying to do is help people like Wanamaker identify which half works.
Big data is an interesting topic because we have been using data for a long time for customer relationships, for enterprise, for human relations management, but these have involved several terabytes of data.
IBM says that the problem with big data is that there are three V's. Let me make sure I'm coming through okay.
[Teacher reading: On board.]
Once again, I'll define a zettabyte soon.
[Teacher reading: On board.]
Finally, [Teacher reading: On board.]
One of the early examples of big data you're all familiar with is Google.
When you search in Google, you don't expect to sit and wait for five minutes to figure out what the results are. You want them instantly, and that's what people are more and more asking for when they start working with big data. The data comes in rapidly and it has to be processed and delivered rapidly to users.
So volume, variety, and velocity are all three critical.
Another major factor affecting big data is cheaper forms of computers paired with social networking, which goes beyond Twitter and Facebook.
Text messages and all the things you see in everyday life- this is the social media side that collides with web search material, huge systems delivering information, and these two produce big data.
Big data, whatever you think of it as, is growing 60% per year. When you start to think about it, Twitter generates more than 7 terabytes a day, and Facebook 10. Let's look at what I consider to be social networking.
Does my cursor show up? Okay. In 1 minute, there are 690 search queries....
[Teacher reading: facts on board]
You can see this vast array of social networking that occurs, all of which produces huge amounts of data.
This is a map of Facebook. They took a subset of all Facebook accounts and identified them by GPS, then connected them to friends. We've outlined the world with Facebook accounts. This is how it takes in the world.
It does not extend into China or much of Russia, or central Africa, but other than that, it covers the whole world, especially the US and Europe. Brazil does not use Facebook.
Professor: Do we have a technical problem?
Belford: So much for big data. Okay, I don't think that's us.
Professor: He'll probably be trying to get back online.
Belford: There may be another way to get ahold of him.
Professor: Well, while we're doing that, I handed out that list of questions for you to fill out, so now would be a good time to think of questions for the next homework. Anyone need a pen?
Male Student: Does zettabytes come after terabytes?
Professor: Good question. Write that down.
Belford: Yeah, he's definitely offline.
Professor: Well, if you could bring up the slides we could go through them ourselves.
Belford: Yeah, where are they? In the dropbox? Okay, I do have them. Give me a second.
Belford: My browser has resized so I can't see.... this is ridiculous. Okay, I can do that. Only when you're doing a presentation, huh? Andrew, do you have your cell phone with you? Can you find Harry Pence's phone number? He's in NY.
Professor: So we'll just go through his slides until he gets back online.
Belford: Well, this is a shame because I know Harry really wanted to give this lecture.
Professor: Anyway, here's the diagram showing the connectivity of Facebook. What makes this possible is that memory is so cheap, and it's getting cheaper all the time.
A terabyte used to cost $14 million, but now you can get your own for less than $100.
60% per year. Does anyone recall the percent per year increase of.... 18 months to 2 years. The growth of data is increasing along with computers under Moore's law.
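To put the 60% figure next to the 18-months-to-2-years doubling time mentioned here, a doubling time can be converted into a yearly growth rate. This is only a rough sketch that assumes steady exponential growth; the function names are illustrative and are not from the slides.

```python
def annual_growth_from_doubling(months_to_double: float) -> float:
    """Yearly growth rate (as a fraction) implied by a fixed doubling time."""
    return 2 ** (12.0 / months_to_double) - 1.0


def grow(amount: float, yearly_rate: float, years: int) -> float:
    """Compound an amount at a constant yearly rate."""
    return amount * (1.0 + yearly_rate) ** years


if __name__ == "__main__":
    # Doubling times quoted in the lecture: 18 months and 2 years.
    for months in (18, 24):
        rate = annual_growth_from_doubling(months)
        print(f"doubling every {months} months is {rate:.1%} per year")
    # Data growing 60% per year, starting from 1 unit, after 10 years.
    print(f"60% per year for 10 years: about {grow(1.0, 0.60, 10):.0f}x")
```

Under that assumption, doubling every 18 months works out to roughly 59% per year and doubling every 2 years to roughly 41% per year, so a 60% yearly growth in data is at or slightly above the faster end of that range.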
So, I don't know... this seems like a lot of memory, but it just means all the memory from Twitter could fill up 7 external memory devices.
What does mega mean?
It's 1 level above kilo. It's a million. This is a million bytes.
So when I started teaching, the department got new computers that had 20 megabytes but now, laptops are measured in gigabytes.
Terabytes- trillion, petabyte- quadrillion.
I guess a zettabyte is quintillion, probably.
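For reference, the standard decimal prefixes can be laid out as a small lookup table. In this standard table the quintillion step is the exabyte; the zettabyte is a sextillion bytes and the yottabyte a septillion. The helper name `bytes_in` is made up for this sketch.

```python
# Decimal (SI) byte prefixes: (unit name, power of ten, number word).
PREFIXES = [
    ("kilobyte", 3, "thousand"),
    ("megabyte", 6, "million"),
    ("gigabyte", 9, "billion"),
    ("terabyte", 12, "trillion"),
    ("petabyte", 15, "quadrillion"),
    ("exabyte", 18, "quintillion"),
    ("zettabyte", 21, "sextillion"),
    ("yottabyte", 24, "septillion"),
]


def bytes_in(unit: str) -> int:
    """Return how many bytes one of the listed units contains."""
    for name, exponent, _word in PREFIXES:
        if name == unit:
            return 10 ** exponent
    raise ValueError(f"unknown unit: {unit}")


if __name__ == "__main__":
    for name, exponent, word in PREFIXES:
        print(f"1 {name} = 10^{exponent} bytes (a {word})")
```

Each step in the table is a factor of 1,000 over the one before it.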
They're going to run out of words pretty soon.
Male Student: They'll invent more.
Professor: Haha, right. So you have a 4 drawer filing cabinet filled with text, you've got a petabyte.
Now, I think that's a typo. I think it should say megabyte.
[Teacher reading: On board.]
That's not just novels and major works- that's all text.
Facebook contains more than 100 petabytes.
You know, that's a lot of data, but a lot of it is kind of junk data. You get junk mail, and that's data that isn't very useful.
So the amount of data is increasing by leaps and bounds, but humans are kind of staying the same, so the amount of data available is growing faster than people can deal with it, which will lead to big data, which is a current hot problem in computing.
[Teacher reading: On board.]
So we're dealing with quadrillions of copies of War and Peace.
Light goes 186,000 miles per second. It'd take 3 days to go to the top of that pile of books.
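The height implied by those two numbers can be worked out directly. This sketch only multiplies the figures quoted here (186,000 miles per second and 3 days); it does not check the slide's count of books, which is not in the transcript.

```python
MILES_PER_SECOND = 186_000       # approximate speed of light, as quoted
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds


def light_travel_miles(days):
    """Distance light covers in the given number of days, in miles."""
    return MILES_PER_SECOND * SECONDS_PER_DAY * days


if __name__ == "__main__":
    miles = light_travel_miles(3)
    print(f"Light travels about {miles:,} miles in 3 days")
```

That comes to roughly 48 billion miles for a three-day trip.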
So you can define.... big data is [On board.]
How do you analyze all the data? There's good stuff in there. How do you squeeze juice out of those huge amounts of data?
So you saw that map earlier of facebook connections- that sort of showed an outline of the world.
Going with that mapping concept, researchers at Harvard School of Public Health [On board.]
Malaria is one of the big scourges of mankind. Remember when we talked about toxoplasmosis?
Belford: I was trying to find a video of Harry, but I've got to find one. I'm failing, so excuse me.
Professor: That's okay. Anyway, malaria is way worse than those diseases we talked about.
Here's another example of big data- [On board.]
I think he showed this slide. Basically, there's lots of good stuff sitting there and we only need to get to it and do something with it. Here's a question- [On board.]
It turns out the amount of money spent by candidates was comparable, but the Obama software worked better than the Romney software. I don't think that explains the election.
Any high powered political candidate has to be comfortable with social media. That used to be email, but now it's Reddit, Facebook, and Twitter.
Before, all of these things were handled by secretaries.
Do you know the difference between correlation and causality?
Supposing every time.... let's see.
I'm trying to think of an example.
Supposing you read that every time there's a full moon there is a higher crime rate. Does that mean the full moon causes higher crime?
Male Student: No, but there's a correlation.
Professor: Right, they're correlating, but no causality.
From other knowledge we have, we know one does not cause the other. But one conclusion is no more justified than the other.
So people have a strong tendency, when they're told about correlations, to mistake them for causality. Like a social ill being correlated with social drinking, and they want you to think it's a cause, but it's not.
Maybe there's another factor they're not showing. Surely there's a correlation between teen drinking and dropout rates, but that doesn't mean one causes the other.
It's a good thing to know about.
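The point about a hidden factor can be shown with made-up numbers. In this sketch two quantities never influence each other, but both depend on a third, unseen factor, and they still come out correlated. All of the data is synthetic, and the names `simulate` and `pearson` are illustrative.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, seed=0):
    """Two measurements driven by one hidden factor plus independent noise."""
    rng = random.Random(seed)
    hidden = [rng.gauss(0, 1) for _ in range(n)]
    a = [h + rng.gauss(0, 1) for h in hidden]  # a never looks at b
    b = [h + rng.gauss(0, 1) for h in hidden]  # b never looks at a
    return hidden, a, b


if __name__ == "__main__":
    hidden, a, b = simulate()
    print(f"correlation of a and b: {pearson(a, b):.2f}")
    # Subtract the hidden factor and the relationship goes away.
    a_rest = [x - h for x, h in zip(a, hidden)]
    b_rest = [y - h for y, h in zip(b, hidden)]
    print(f"after removing the hidden factor: {pearson(a_rest, b_rest):.2f}")
```

The two measurements are clearly correlated, yet changing one would do nothing to the other; once the shared factor is accounted for, the correlation drops to about zero.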
Okay, nevertheless they were able to find, you know, understand what was going on about the spread and intensity of a flu epidemic by looking at Google searches.
If you've ever gone to Amazon to look for a book and it shows you what books other readers have bought, they do it by correlations. They see what books people buy and how many people buy both, and they make suggestions.
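The "how many people buy both" idea can be sketched as simple pair counting. This is not Amazon's actual system, which the lecture does not describe; it is a minimal illustration, and the baskets and book titles below are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how many baskets contain each pair of titles."""
    counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return counts


def suggest(book, baskets, top_n=3):
    """Titles most often bought together with the given book."""
    scores = Counter()
    for (a, b), n in co_purchase_counts(baskets).items():
        if a == book:
            scores[b] += n
        elif b == book:
            scores[a] += n
    return [title for title, _ in scores.most_common(top_n)]


if __name__ == "__main__":
    # Hypothetical purchase histories, one list per customer.
    baskets = [
        ["Book A", "Book B"],
        ["Book A", "Book B", "Book C"],
        ["Book A", "Book C"],
        ["Book B", "Book D"],
        ["Book A", "Book B"],
    ]
    print(suggest("Book A", baskets))
```

A real recommender would also have to cope with very popular titles that pair with everything, which raw counts like these do not correct for.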
How about this? Target has done some data mining and they've discovered a certain type of person is likely to be pregnant.
Here's another thing- the amount of data available will keep increasing because people are constantly on their phones, taking pictures, etc. It won't be long before everyone just has their own camera system that will record 24 hours a day and the world will be bathed in live videos and audio feeds. Maybe not live, but there will be tremendous amounts of what happens on video somewhere, and that's a lot of data. It's also a really strange social phenomenon when you think about it.
Driving monitors. How many of you drive? Would you agree to have a permanent recording device on your car that would take video out the window, record your speed, etc. for posterity if they offered a $10 decrease in insurance rates?
Male Student: I don't want them to track me GPS wise, but everything else is fine.
Male Student: I don't want them to know how many holes I hit in three minutes, or if I swerved or forgot my blinker- really though, just the GPS thing.
Professor: Do you all draw the line at GPS?
Male Student: I don't know if I'd do $10 a month. I'd rather have a 10% discount.
Professor: How about 50%?
Male Student: They could monitor my GPS then. That would be fine then.
Professor: Inexpensive sensor. We're talking about little cameras you carry around that are mounted to your cell phone. It kind of shows a sequence of power of sensors vs. time.
Ubiquitous position- there you go. GPS. That's available now and you can outfit your pet with an RFID chip.
Male Student: There's collars for it now.
Professor: Oh, I bet. And you can embed RFID chips under the skin of your pet. There's a movement among humans to put magnets in the fingertips.
Male Student: There's a company that does that.
Professor: How many people would like to sort of put something like that in their bodies?
Male Student: It depends on what it's for.
Male Student: If it's just my debit card information and it's just to swipe over something, I'd consider that. Or even my health information. A lot of people get that chip in their hands and the doctors just scan that right there in the ER.
Professor: If they could put a chip in your skin with that information, why not put the entire Library of Congress on the same chip and let you access it on your own private server? Of course, you can already access it through the web. No need to have it on a chip. But if the web went down or you didn't have Internet, I guess.
Male Student: See, I see the physical world web is on there- it's kind of like the Google Glass that's coming out- just being able to look up and being able to check your email and contact someone. My favorite thing is they have where you can set up.... like say I'm going to walk from UALR to the mall, and it will make a GPS coordinate and give you directions in a Street View in the corner of your eye.
Male Student: It's like Google Street View in glasses.
Professor: Cool. Why not put it on glasses?
Male Student: Google doesn't officially support that part. But basically, your trail would be highlighted in green when you looked at it because it's a GPS. So it's like Mario Kart.
Professor: My question about Google glasses is how do you focus on it?
Male Student: The applications are all above and you just look up to it.
Professor: It's hard to look at something so close.
Male Student: Have you seen the video for it? I know it's something like a timed response- you look at it for three seconds and it comes up. It's Google.
Professor: Have you heard anything about that?
I'm very enthusiastic about them. I want to try them.
Male Student: They're only like $250.
Professor: Oh, okay. So here's where we are. Telepresence is... you've heard of telecommuting where you work at home. But telepresence is where you operate a robot at your work place and you can work it at home. You have an actual robot avatar that you can control.
A lot of people have been talking about annotating the information on the web in a semantic web which is based on meanings, not just links.
People were talking about it for quite a few years, but how far or fast it'll go, I don't know.
[Teacher reading: On board.]
If you're operating your avatar robot at your place of work while you're sitting at home, that's not accessing information over the Internet- it's also controlling something physically.
Male Student: Have you ever seen Surrogates with Bruce Willis? You don't do anything. You wake up in the morning and log into your own personalized robot and you just take that robot to work and if the robot dies you get a new surrogate. The premise of the movie was when the surrogate died you would die. You look like you in your 20's when you're in your 50's. It was pretty cool.
Professor: Alright. Here's more of these.
[Teacher reading: On board.] Yottabyte is sextillion and Brontobyte is [On board.]
You know what these exponents mean, right?
Male Student: Scientific method for measuring the value.
Professor: Right. We talked about big data and social networks, but what about scientific data? Here's a rendition of a radio telescope which could be used to peer into the universe.
Okay. [Teacher reading: On board.]
I think this is a good quote because it explains why people care about big data. It's because there are techniques for doing it better to identify patterns from data to extract new information that we would have never thought of otherwise.
That will make it increasingly possible to extract facts of science itself.
Right now, scientists have to think really hard to think of new ideas.
Here's an example of how that might play out. Have you heard of the Large Hadron Collider?
Male Student: I've heard that there's like a certain millimeter distance that if fired wrong could create a black hole.
Professor: Yeah. And that would suck in the earth and cause it to disappear.
Male Student: That would be a fun apocalypse. What would happen to the surface as everything else is pulling in?
Male Student: There would be a lot of funding for space travel at that point.
Professor: This is a huge device. It produces huge amounts of data.
We talked about how weather is a complex simulation process using huge computers and lots of data from weather sensors. It gets crunched in a big computer for forecasting.
Future of healthcare.
Dr. Agawal, you're doing something in this, right?
Agawal: We're trying to see how data from social media can be used to place certain health problems. For example, using social media to find out what kind of health problems exist where and to learn more about the behaviors of the people in that area.
Agawal: We're working with UAMS to further expand this analysis to see how we can develop intervention strategies. Those interested, your professor has my contact information.
Professor: Smart phones blow my mind. My kids have them, but I don't. A smart phone is a pocket computer that does everything. It's a GPS and they're modifying them to do medical tests.
Really, it's not really a phone at all. It's a general purpose tiny little computer.
Genome analysis is another example of the use of information technology in healthcare.
What's a genome? It's your complete set of DNA.
DNA is like a string of characters that describe your genetic characteristics. A human being has about 3 billion characters in their sequence, and yours is almost like mine, but not quite the same.
Every cell has the same 3 billion.
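To get a feel for how much data 3 billion characters is, here is a rough storage estimate. It assumes the sequence uses only the four letters A, C, G, and T, so each letter needs 2 bits; real sequencing files carry extra information and are much larger. The function name is illustrative.

```python
GENOME_LETTERS = 3_000_000_000  # about 3 billion characters, as quoted


def storage_bytes(letters: int, bits_per_letter: int) -> int:
    """Bytes needed to store a sequence at a fixed number of bits per letter."""
    return letters * bits_per_letter // 8


if __name__ == "__main__":
    packed = storage_bytes(GENOME_LETTERS, 2)      # 4 possible letters -> 2 bits
    plain_text = storage_bytes(GENOME_LETTERS, 8)  # one byte per character
    print(f"packed at 2 bits per letter: {packed / 10**6:,.0f} megabytes")
    print(f"stored as plain text: {plain_text / 10**9:,.0f} gigabytes")
```

So one person's sequence fits in a few gigabytes at most; the volume problem comes from multiplying that by millions of people.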
[Teacher reading: On board.]
It happened before that. They spent about a billion dollars to do that.
Male Student: That's personal, but the human genome project itself is billions and billions.
Professor: Now, if you have $10,000, you can get it done. If you want to wait a few years, $1,000 will do it. Once it hits $100, it'll be a standard part of your medical record.
Agawal: There's a website where you can send your own saliva sample and they'll do a sequencing. They'll also upload your DNA sequences and make it publicly available, and researchers can use that to assess risks that may be there in terms of health. 23andme.com. One of the co-founders is the wife of a co-founder of Google.
Professor: Neat. I know they have things like that that tell which parts of the world your ancestors are from. Anyway, in the future your doctors will probably just have this information on you when you go to the hospital.
So it's not going to be long before your genome is on file with your doctor. There you go.
[Teacher reading: On board.]
So we get down to $100 per 3 billion DNA letters per person. The problem will be what to do with the data. How do you process all that data to get results?
Okay, so this gets back to one of the things Pence talked about- the three V's.
So in the old days databases were relational databases. But now, there are so many kinds of databases that it becomes an issue with all the file types ranging from images to audio to everything.
So that's the variety issue. The volume issue [On board.]
And velocity- you know, how fast is the data being used? The first computer programming I took was in punch cards. Now it's in real time.
So volume of data. [Teacher reading: On board.]
[Teacher reading: On board.]
And velocity- how fast is the data flowing?
So, you know, the early search engines were the pioneers in dealing with big data because they had to index the entire web. These companies have an index of the entire web, and that's a lot of data.
Google has.... it was a latecomer to the search engine business, but they had some great algorithms that were better than anybody else's.
The first one for analyzing the web- what was that called?
What was Google's original algorithm?
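The algorithm being asked about is widely known as PageRank. The sketch below is a textbook-style simplification, not Google's production code: each page repeatedly passes its score along its outgoing links. The damping value of 0.85 is the commonly quoted one, and the tiny link graph is made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical four-page web.
    web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank(web).items(), key=lambda kv: -kv[1]):
        print(f"{page}: {score:.3f}")
```

In this toy graph page "c" ends up on top because three of the four pages link to it, which is the basic intuition: a page matters if pages that matter point to it.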
Professor: Oh yeah. So here again we're back to data using relational databases, but that won't cut it in the real world where we have RTF, XML, etc.
You probably have heard of MySQL. There's another called Hadoop for non-SQL databases.
So it doesn't have to be in tables and so on.
Male Student: I recently, in my occupation, went through Hadoop training, and it's crazy how it processes data.
Professor: So what kind of data do you use it with?
Male Student: I work for Acxiom, so mostly marketing stuff. Just the overall architecture of what it sets up- it knows if it's sitting on one rack, it knows that the closest data node to do a processing job is two racks over, so it knows exactly where to send it.
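The jobs that Hadoop sends out to those data nodes are usually written in the MapReduce style. The sketch below imitates the three steps (map, shuffle, reduce) in plain Python on one machine; it does not use Hadoop's own API, and the function names are illustrative.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a total."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce over a list of text records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big deal", "Data about data"]))
```

On a real cluster the map step runs on whichever node already holds each chunk of the data, which is where the rack awareness described above comes in.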
Professor: Yeah, so it's very concurrent with computing. This is an example of a data visualization. Data visualization is cool. It summarizes data in a very neat way.
I think this is frequency of search terms over time.
It shows different search terms and the frequency at which they're occurring. Looks like food is pretty popular in this body of data.
Okay, so this is conversation about a particular brand- a brand of food.
Okay, how about fracking? The oil industry uses a lot of data.
I'm not sure what to say about this. I'll just mention that Arkansas has a lot of natural gas that's being extracted right now.
Okay, so the problem with big data is what do you do with it? You can process it to get smart data and get useful stuff from it.
How do you extract useful stuff from big data? Machine learning is a part of AI- something called a data scientist is someone who can use machines to extract data. A field called data science is getting a lot of attention right now.
There are companies doing things like scraping things off the web, processing it, and selling it.
This guy says it only costs $120 to analyze and visualize 220 Facebook profiles. This is some technical stuff about statistical bias.
This is interesting. This is Twitter users and... do you know what this was all about, Bob?
I guess Twitter users subscribe to other users, and they form communities.
Belford: Have you heard of the flipped classroom? It's where the lecture is outside of the class, and the professor talks on YouTube, then you come to class and do problem solving.
Professor: What do you all think about that? I grew up going to class and listening to lectures. I try to get interactive....
Belford: I'll ask Harry to post this talk online. I think that would allow you folks to hear what he says.
Professor: I like the idea of the flipped classroom. What do you think about that? Listening to the lecture online?
Male Student: I could watch YouTube videos. Do you have any professors who do that?
Well, it's extra work. To get something out of it, you have to put something into it.
Male Student: I think in a setting like this it would work. I think for calculus or some of your other science courses where a lot of material is very structured as far as this is how it is, but for programming classes where it's more abstract, I think it would work more efficiently.
Professor: I'd like to hear more from Dr. Pence. I introduced him....
Belford: Can I ask a question? I was on the news before and I think ASU is requiring all incoming freshmen to have iPads.
Male Student: Are they buying them?
Male Student: I think that's terrible.
Belford: If you take general chemistry, you pay $326 for your general chemistry book. And then at the end of two semesters, what are you doing? If you bought an iPad you could access your book on the iPad.
Male Student: So are the textbooks free there?
Belford: No, but if you moved into digital textbooks- that's what this course is about. There are things where you can get free online textbooks and stuff. You can get the information you need on that little device.
Male Student: Let me propose an alternative- we've got these thin clients that the university paid for. If I wanted to access my ebooks- because all of my books are in PDF form or on Cengage. I can access them through what the university provided. I don't think the student should buy the iPad.
Belford: But your education is only good while you're at school.
Male Student: I understand what you're saying.
Professor: I'm only standing up so I can see you.
Male Student: I think the iPad is too strong.
Male Student: I can purchase all my books through Amazon, and the Chromebook is only $250. It's just the same. It has to be connected. It's just cheaper.
Professor: Why not use a smart phone?
Male Student: I'd never want to read on a smartphone.
Belford: I want to apologize to you folks for whatever is happening. What we should have done was we should have gotten a land line. We did not have a plan B. Never do anything without a plan B. That's your lesson today.
Professor: Well, we're on Plan C now. Here's another domain for big data- education.
I'm just going to go off- how many of you have heard of a MOOC?
Massive open online courses. There are companies now which are producing those courses. A couple of years ago they started these and hundreds of thousands of people signed up for these courses, and three of those professors left Stanford to start these courses. One of them started Udacity.
Right? Two of them started another, and one started Udacity. They're just companies there to provide massive courses.
And UALR is running scared right now. You don't hear what the faculty and admin talk about, but they're talking about how we're going to bring MOOCs to be a part of the UALR experience so we can compete with other universities. I don't know if any of you have taken these courses, but that's where these things are going. Much of what students do here will be done through online courses.
Does anyone see any problems with that? With going to online courses?
Male Student: I see problems in certain courses that were offered online. You can't develop lab skills with your hands- dissecting a frog or something- with a computer.
Professor: It's not the same watching an interactive video.
Male Student: A lot of my online classes feel like high school because they just give you busy work. I'm not a fan of that kind of thing.
Professor: There are questions about quality issues, but it's so much cheaper.
In many colleges, students live on campus for four years. A lot of you probably have jobs, and some of you maybe don't.
Online education is going to be a lot cheaper, and that will compensate for reduction in quality. People will tolerate reduction in quality for a reduction in price. I think UALR is a little scared right now about that.
Many courses will be amenable to this.
Male Student: I know Udacity- I know they're currently looking at making concurrent credit through these courses at regular colleges.
Professor: I don't think becoming a professor is a part of a growing job market.
I think we can stop here. Thank you for listening. I'm sure Pence is sorry to be disconnected.
Male Student: Where should we turn in the questions?
Professor: I think we'll just dispense with that. If you filled it out I'll take it though.