Friday, November 7, 2008

Unit 10 Reading Notes

Web Search Engines: Parts 1 & 2 reminded me just how interesting it is that search engines can get us fast, accurate results with minimal spam. Spam filters are getting so much better, and the search engine algorithms are improving all the time. It's a complicated system, and there are so many things to be aware of. The politeness aspect (not overloading the servers being crawled) was one I hadn't thought of much, but it's important. I know a bit about SEO, so not all of this information was new to me, but these readings were clear and pretty easy to understand.
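
The politeness idea can be sketched in a few lines. This is only an illustration of one common reading of it (wait a minimum time between requests to the same host), not how any particular search engine does it. The class name, the method names, and the 2-second delay are all assumptions made up for the example.

```python
from urllib.parse import urlparse


class PolitenessTracker:
    """Hypothetical helper: remembers when each host was last contacted."""

    def __init__(self, min_delay_seconds=2.0):
        # Assumed delay; real crawlers choose their own values.
        self.min_delay_seconds = min_delay_seconds
        self.last_request_time = {}  # host -> timestamp of the last request

    def seconds_to_wait(self, url, now):
        """Return how long the crawler should still wait before fetching url."""
        host = urlparse(url).netloc
        last = self.last_request_time.get(host)
        if last is None:
            return 0.0  # never contacted this host, so no wait is needed
        return max(0.0, self.min_delay_seconds - (now - last))

    def record_request(self, url, now):
        """Note that a request to url's host was made at time now."""
        self.last_request_time[urlparse(url).netloc] = now
```

A crawler would call seconds_to_wait() before each fetch and record_request() after it, so pages on the same server are spaced out while other servers can still be fetched right away.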

Current developments and future trends for the OAI protocol for metadata harvesting - I'm not really familiar with OAI, but it has to do with harvesting metadata from databases and repositories in order to create better access to information.
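
As a rough sketch of what a harvesting request looks like: an OAI-PMH harvester asks a repository for records over plain HTTP, using a verb such as ListRecords and a metadata format such as oai_dc (Dublin Core). The repository address and the function name below are made up for illustration; only the verb and parameter names come from the protocol.

```python
from urllib.parse import urlencode

# Hypothetical repository address, used only for this example.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build the URL for an OAI-PMH ListRecords request."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Optional: only ask for records changed on or after this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


print(list_records_url(BASE_URL, from_date="2008-11-01"))
```

The repository answers with an XML document containing the metadata records, which the harvester can then index or search.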

The Deep Web: Surfacing Hidden Value - Good comments on searching, the limitations of searching, the problems we face trying to both narrow results and provide access to as much information as possible, and the difference between surface content and the deep web. 750 terabytes is a massive amount of information! I'm not surprised that some of the biggest deep web sites are government sites, either. Interesting comment on how "higher quality" really means "did I get what I wanted?"

Friday, October 24, 2008

Muddiest Point Week 8

I have a question about searching within a site. Why is it that the search feature of so many sites just doesn't work well? I nearly always end up using Google to search within a site. Is it inexperienced programmers? The massive resources of Google? Something else?

Unit 9 Reading Notes

Introducing the Extensible Markup Language (XML)

- XML helps the user combine documents, identify formats, and add comments to files.

- XML tags are structured and logical, more so than similar features used by word processors.

- It's clear where the tags begin and end.

- You can create your own tag sets using a DTD (document type definition).

- XML files are easily stored in databases and are easily changed to fit any database.

- XML files are easily transferred to many different kinds of hardware and software. The files will not get outdated quickly; they will only need to be updated.
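
A small sketch of the points above, using Python's standard xml.etree.ElementTree module. The tag names (reading_notes, note, title, point) are invented for this example, which is the point: XML lets you define your own tags, and a parser can tell exactly where each one begins and ends.

```python
import xml.etree.ElementTree as ET

# A made-up document with its own tag set and an embedded comment.
document = """<reading_notes>
  <!-- comments can sit inside the file -->
  <note unit="9">
    <title>Introducing the Extensible Markup Language (XML)</title>
    <point>Tags are structured and logical.</point>
  </note>
</reading_notes>"""

root = ET.fromstring(document)
for note in root.findall("note"):
    # Every opening tag has a matching closing tag, so lookups are unambiguous.
    print("Unit", note.get("unit"), "-", note.find("title").text)


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False
```

Calling is_well_formed() on text whose tags do not nest properly, such as "<a><b></a>", returns False, because the parser refuses input where it cannot match each opening tag to its close.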



A survey of XML standards

- XML has different sets of standards, which can make it difficult for new users to determine which standards to use. Standards can be complex. The author adds that XML has a lot of components and can intimidate new users.

- This article focused on what the author thought of as the "core XML technologies."

- Some updates to XML are controversial. Some believe changes are not worth it if the benefit is too small.

- The article briefly summarizes several types of XML systems. XInclude is a way to combine XML documents and break them into smaller chunks. XLink is a way to use links that are more complex than basic HTML links.

- I found this one a little harder to follow because I'm not familiar with the many systems he discusses.
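
To make the XInclude idea a bit more concrete, here is a minimal sketch using Python's standard xml.etree.ElementInclude module. The file names and chapter contents are invented, and the "files" live in a dictionary instead of on disk so the example is self-contained; load_chunk is a hypothetical loader written for this example.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Stand-ins for separate XML files (hypothetical names and contents).
CHUNKS = {
    "chapter1.xml": "<chapter><title>Search engines</title></chapter>",
    "chapter2.xml": "<chapter><title>The deep web</title></chapter>",
}


def load_chunk(href, parse, encoding=None):
    """Loader for ElementInclude: return the parsed chunk named by href."""
    if parse != "xml":
        raise ValueError("this sketch only handles parse='xml'")
    return ET.fromstring(CHUNKS[href])


# The main document only points at its pieces via xi:include elements.
book = ET.fromstring(
    '<book xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="chapter1.xml"/>'
    '<xi:include href="chapter2.xml"/>'
    "</book>"
)

# Replace each xi:include element with the content it refers to.
ElementInclude.include(book, loader=load_chunk)
titles = [chapter.findtext("title") for chapter in book.findall("chapter")]
print(titles)
```

After include() runs, the book element holds the two chapters directly, so one large document can be kept as several smaller chunks and reassembled when needed.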



Extending your Markup: an XML tutorial by Andre Bergholz

- Starts off by commenting that XML is said to be both simple and able to do everything you need. He says that this is partially hype, but that XML is useful once you get past the hype.

- Provides several examples of XML and compares these examples to HTML.

- You can learn most of what you need to know about XML online.

- Explains in more detail the difference between using links in HTML and XML. For example, XML links can lead to a specific section of a document, whereas an HTML link can only point to a whole document or to an anchor its author marked in advance.

- Goal of XML Schema is to replace DTDs, but only time will tell whether it can.

- I found this reading more beginner-friendly than the previous one because it contained more explanations right in the text rather than primarily linking to other resources.
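
A rough illustration of pointing at a specific section of an XML document. Python's standard library does not implement XLink or XPointer, so this sketch swaps in ElementTree's limited XPath support as a stand-in; the document and its id values are invented for the example.

```python
import xml.etree.ElementTree as ET

# A made-up article with two sections.
doc = ET.fromstring(
    """<article>
  <section id="intro"><para>What XML is.</para></section>
  <section id="links"><para>How XLink and XPointer work.</para></section>
</article>"""
)

# Address a part of the document by its position in the structure...
second_section = doc.find("section[2]")

# ...or by an attribute value, without loading anything else.
links_section = doc.find(".//section[@id='links']")

print(links_section.findtext("para"))
```

Both lookups land on the same section. Selecting by position is the part that has no counterpart in a plain HTML link, since it needs no anchor placed in the target ahead of time.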



XML Schema Tutorial

- Tutorial about how to read and create XML Schemas. Like the previous reading, it shows how XML Schema can now be used instead of DTD, and this tutorial aims to show the ways XML Schema is superior.

- Showed differences between XML Schema and DTD: XML Schema supports data types and namespaces.

- The tutorial went pretty in-depth, and I'm not sure how much of the information I absorbed. And this is only one aspect of what I'd need to know.

- Like the previous tutorials, I found this one useful, but I couldn't help being distracted by the ads and the sidebar crowding the page on the right.
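
A toy sketch of what "supports data types" means in practice. Python's standard library has no XML Schema validator, so the code below is a hand-rolled check that only understands xs:integer. It is meant to show the idea, not to stand in for a real validator. The schema, the element names, and both function names are invented for the example.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# A tiny schema: a note has a string title and an integer unit number.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="unit" type="xs:integer"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def declared_types(schema_text):
    """Map each element name in the schema to its declared type, if any."""
    root = ET.fromstring(schema_text)
    return {
        el.get("name"): el.get("type")
        for el in root.iter(XS + "element")
        if el.get("type")
    }


def type_errors(document_text, types):
    """Return tags of child elements whose text is not a valid integer.

    Only xs:integer is checked here; other types are accepted as-is.
    """
    errors = []
    for child in ET.fromstring(document_text):
        if types.get(child.tag) == "xs:integer":
            try:
                int(child.text or "")
            except ValueError:
                errors.append(child.tag)
    return errors


types = declared_types(SCHEMA)
print(type_errors("<note><title>XML</title><unit>nine</unit></note>", types))
```

A DTD could say that a note contains a title and a unit, but it has no way to say that the unit must be a number. The schema's type attribute is what makes a check like this possible.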