Sunday, January 29, 2012

Week 4: Reading Notes

1) Database: http://en.wikipedia.org/wiki/Database

Databases - organized collections of data, referring to the data itself, not a database management system (DBMS)

Popular DBMSs: Oracle, Microsoft SQL Server, MySQL, and SQLite

Database languages: data definition languages (DDL), data manipulation languages (DML), query languages

Popular database language: SQL - which combines functions of DDL, DML, and query languages
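
To make the three roles concrete for myself, here is a minimal sketch using Python's built-in sqlite3 module. The "books" table and its rows are made up for illustration; they are not from the reading.

```python
import sqlite3

# In-memory SQLite database; the "books" table is a hypothetical example.
conn = sqlite3.connect(":memory:")

# DDL: define the structure of the data.
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)")

# DML: insert and update rows.
conn.executemany(
    "INSERT INTO books (title, year) VALUES (?, ?)",
    [("Book A", 2008), ("Book B", 2001)],
)
conn.execute("UPDATE books SET year = 2000 WHERE title = 'Book B'")

# Query: retrieve rows by their content.
rows = conn.execute(
    "SELECT title, year FROM books WHERE year < 2005 ORDER BY title"
).fetchall()
print(rows)
conn.close()
```

All three kinds of statement go through the same SQL interface, which is the point of the note above.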

Relational Model
-applications should search for data by content, not by following links
-somewhat limited in the data types it could handle, so relational model vendors extended their products to support a larger variety of data types

General-Purpose DBMSs
-not always optimal, when considering certain specialized jobs
-DBMS developers, application developers, database administrators, and application end-users are the groups involved with a general-purpose DBMS

Examples of Database Types:
-Active database
-Cloud database
-Data warehouse
-Distributed database
-Document-oriented database
-Embedded database

Data Models: provide a way to define the data structures needed to model an application
-Hierarchical
-Network
-Relational
-Entity-relationship

Database Architecture: may be considered an extension of data modeling.
-External level - how each end-user understands data organization
-Conceptual level - takes all external views and organizes into one coherent view
-Internal level - concerned with database implementation

Database Security
-Access control
-Data security
-Database audit

DBMS Architecture: specifies components and their interfaces
-external interfaces
-database language engines
-query optimizer
-database engine
-storage engine
-transaction engine
-DBMS management and operation component

Database Transactions
-All transactions obey the following rules: ACID (Atomicity, Consistency, Isolation, Durability)
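
A rough sketch of the "A" in ACID (atomicity), again with Python's sqlite3 module. The accounts table, the names, and the transfer helper are my own hypothetical example, not something from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Hypothetical helper: both updates are committed, or neither is."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)
failed = transfer(conn, "alice", "bob", 500)  # would make alice's balance negative
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

The second transfer credits bob first and then fails the CHECK constraint on alice, so the whole transaction is rolled back and bob's credit disappears with it.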

2) Entity relationship model in database: http://en.wikipedia.org/wiki/Entity-relationship_model

Entity relationship model (ER) - abstract and conceptual representation of data
-ER diagrams are drawn with “rectangles to represent entities, and diamonds to represent relationships.”

Semantic Model - a model of concepts

Crow’s Foot notation - uses boxes to represent entities and lines between the boxes to represent relationships (instead of diamonds)


3) Database Normalization Process: http://www.phlonx.com/resources/nf3/

Database Normalization Process:

Three forms to memorize:

  1. No repeating elements or groups of elements
  2. No partial dependencies on a concatenated key
  3. No dependencies on non-key attributes
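
To check my understanding of the three forms, here is a small sketch in plain Python. The order data, column names, and values are hypothetical; the tutorial uses its own invoice example.

```python
# Hypothetical flat order data: one row per (order, item) pair, so there are
# no repeating groups and the data is already in first normal form.
flat = [
    # order_id, customer_id, customer_city, item_id, item_desc, qty
    (1, "C1", "Pittsburgh", "I1", "pen", 10),
    (1, "C1", "Pittsburgh", "I2", "notebook", 2),
    (2, "C2", "Erie", "I1", "pen", 5),
]

# Second normal form: item_desc depends only on item_id, which is just part
# of the concatenated key (order_id, item_id), so it moves to its own table.
items = {item_id: desc for _, _, _, item_id, desc, _ in flat}
order_items = {(order_id, item_id): qty for order_id, _, _, item_id, _, qty in flat}

# Third normal form: customer_city depends on customer_id, a non-key
# attribute of the order, so customers get their own table too.
customers = {cust: city for _, cust, city, _, _, _ in flat}
orders = {order_id: cust for order_id, cust, _, _, _, _ in flat}

print(items)
print(order_items)
print(customers)
print(orders)
```

After the split, each fact (an item's description, a customer's city) is stored in exactly one place.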

Sunday, January 22, 2012

Week 3: Reading Notes

1) Anne J. Gilliland. Introduction to Metadata, pathways to Digital Information: 1: Setting the Stage http://www.getty.edu/research/conducting_research/standards/intrometadata/setting.html 

-Metadata is "data about data."
-Metadata isn't a familiar term to basic users, although they "are increasingly adept at creating, exploiting, and assessing user-contributed metadata such as Web page title tags, folksonomies, and social bookmarks."
-All information objects have content, context, and structure.

Data structure = MARC
Data value (controlled vocabularies) = LCSH (Library of Congress Subject Headings)
Data content (cataloguing rules) = AACR, RDA
Data format/technical interchange (manifestation of a data structure) = MARC21, MARCXML

-Context is important to archivists, especially in a museum setting.
-More to metadata than description and resource discovery? (e.g. exhibition catalogs, acquisition records, licensing agreements, educational metadata)
-user-created metadata, folksonomies

Different types of metadata:
-Administrative
-Descriptive
-Preservation
-Technical
-Use

Why is metadata important?
-Accessibility
-Retention of context
-Expanding use
-Learning metadata
-Legal issues
-Preservation

2) Eric J. Miller. An Overview of the Dublin Core Data Model http://dublincore.org/1999/06/06-overview/

-Dublin Core Metadata Initiative (DCMI) strives for consensus on discovery-oriented descriptions across disciplines.

DCMI requirements:
-Internationalization
-Modularization/Extensibility
-Element Identity
-Semantic Refinement
-Specification of controlled vocabularies
-Identification of structured compound values
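
A minimal sketch of what a Dublin Core description can look like as XML, built with Python's standard xml.etree module. The record values and the "metadata" wrapper element are my own assumptions for illustration; Miller's overview works with RDF, which this simplified sketch does not attempt to reproduce. The namespace URI is the Dublin Core element set namespace.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)

# Hypothetical resource description; the values are made up for illustration.
record = {
    "title": "Week 3 Reading Notes",
    "creator": "A. Student",
    "date": "2012-01-22",
    "language": "en",
}

root = ET.Element("metadata")  # wrapper element name is an assumption
for element, value in record.items():
    # Qualifying each element with the namespace gives it a unique identity.
    child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
    child.text = value

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Tying each element name to a namespace is one way to keep elements unambiguous when descriptions from different vocabularies are mixed.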

3) Working with Endnote, http://www.hsl.unc.edu/Services/Tutorials/ENDNOTE/intro.htm

-bibliographic software program
-select a reference library
-choose different citation styles
-sort, find and view references
-Cite While You Write (CWYW) is an Endnote feature accessed in Microsoft Word. You can insert citations at any time during the writing process.
-Instant Formatting

Sunday, January 15, 2012

Week 2: Reading Notes

1) Computer Hardware: http://en.wikipedia.org/wiki/Computer_hardware and Computer Software: http://en.wikipedia.org/wiki/Software

Mini Computers
-middle range of computer systems (between large multi-user systems such as mainframe computers and small single-user systems such as microcomputers or personal computers)
-also known as “mid-range computer”
-in the 1960s, minicomputers usually took up the space of one or two refrigerators, compared to a mainframe system taking up an entire room
-“The first successful minicomputer was Digital Equipment Corporation’s 12-bit PDP-8, which cost from US$16,000 upwards when launched in 1964.”
-In the 1990s, the market began to shift from minicomputers to inexpensive PC networks.
-Microsoft Windows, beginning with Windows NT, supported multitasking and other features required for servers
-“…although today’s PCs and servers are clearly microcomputers physically, architecturally their CPUs and operating systems have evolved largely by integrating features from minicomputers.”

Personal Computers
-Personal Computers, or PCs, are meant for the average end user
-Software on PCs includes: word processing, spreadsheets, databases, Web browsers, e-mail clients, digital media playback, games, etc.
-Early PC users had to write their own programs to do anything with their machines.
-"Since the early 1990s, Microsoft software and Intel hardware have dominated much of the personal computer market, first with MS-DOS and then with the Wintel platform. Alternatives to Microsoft's Windows operating systems include Apple's Mac OS X and the open-source Linux OSes. AMD is the major alternative to Intel's central processing units.”
-"In July and August 2011, marketing businesses and journalists began to talk about the 'Post-PC Era', in which the desktop form factor was being replaced with more portable computing such as netbooks, Tablet PCs, and smartphones.”

History
-"In what was later to be called The Mother of All Demos, SRI researcher Douglas Engelbart in 1968 gave a preview of what would become the staples of daily working life in the 21st century - e-mail, hypertext, word processing, video conferencing, and the mouse. The demonstration required technical support staff and a mainframe time-sharing computer that were far too costly for individual business use at the time.”
-Early PCs, or microcomputers, were mostly of interest to hobbyists and technicians. “Practical use required adding peripherals such as keyboards, computer displays, disk drives, and printers.”
-In 1982, “The Computer” was named Machine of the Year by Time magazine.

Hardware
-Mass-market computers use standard components and are easily assembled.
-“A typical desktop computer consists of a computer case which holds the power supply, motherboard, hard disk and often an optical disc drive. External devices such as a computer monitor or visual display unit, keyboard, and a pointing device are usually found in a personal computer.”
-"The motherboard connects all processor, memory and peripheral devices together. The RAM, graphics card and processor are mounted directly onto the motherboard. The central processing unit microprocessor chip plugs into a socket. Expansion memory plugs into memory sockets. Some motherboards have the video display adapter, sound and other peripherals integrated onto the motherboard. Others use expansion slots for graphics cards, network cards, or other I/O devices. Disk drives for mass storage are connected to the mother board with a cable, and to the power supply through another cable. Usually disk drives are mounted in the same case as the motherboard; formerly, expansion chassis were made for additional disk storage.”

Mass storage
-Floppy drive…zip drive…optical drive…different methods of storage appeared as memory sizes increased.
-A USB drive is typically the external storage device used today
-“The operating system (e.g.: Microsoft Windows, Mac OS, Linux or many others) can be located on any storage, but typically it is on a hard disks. A Live CD is the running of a OS directly from a CD. While this is slow compared to storing the OS on a hard drive, it is typically used for installation of operating systems, demonstrations, system recovery, or other special purposes."

Operating System
-“An operating system (OS) manages computer resources and provides programmers with an interface used to access those resources.”
-Common OSs: Microsoft Windows, Mac OS X, Linux, Solaris and FreeBSD.
-“Unix was developed at Bell Labs beginning in the late 1960s and spawned the development of numerous free and proprietary operating systems.”
-“Linux is a family of Unix-like computer operating systems. Linux is one of the most prominent examples of free software and open source development: typically all underlying source code can be freely modified, used, and redistributed by anyone.”

Computer Architecture
-"In computer science and engineering, computer architecture is the practical art of selecting and interconnecting hardware components to create computers that meet functional, performance and cost goals and the formal modeling of those systems.”
-“The noun computer architecture or digital computer organization is a blueprint, a description of the requirements and basic design for the various parts of a computer. It is usually most concerned with how the central processing unit (CPU) acts and how it accesses computer memory. Some currently (2011) fashionable computer architectures include cluster computing and Non-Uniform Memory Access.”

Performance
-“In a typical home computer, the simplest, most reliable way to speed performance is usually to add random access memory (RAM). More RAM increases the likelihood that needed data or a program will be in RAM. So, the system is less likely to need to move memory data from the disk. The disk is often ten thousand times slower than RAM because it has mechanical parts that must move to access its data.”

Operation
-CPUs execute a sequence of instructions called a program. The four steps of operation are typically: fetch, decode, execute, and writeback.
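
The four steps can be sketched with a toy machine in Python. The instruction format, opcodes, and register names here are invented for illustration and do not correspond to any real CPU.

```python
# Toy machine: each instruction is (opcode, destination register, operand a, operand b).
program = [
    ("LOAD", "r0", 2, None),    # r0 = 2
    ("LOAD", "r1", 3, None),    # r1 = 3
    ("ADD", "r2", "r0", "r1"),  # r2 = r0 + r1
    ("MUL", "r3", "r2", "r2"),  # r3 = r2 * r2
]

registers = {"r0": 0, "r1": 0, "r2": 0, "r3": 0}
pc = 0  # program counter: index of the next instruction

while pc < len(program):
    instruction = program[pc]          # fetch
    pc += 1
    opcode, dest, a, b = instruction   # decode
    if opcode == "LOAD":               # execute
        result = a
    elif opcode == "ADD":
        result = registers[a] + registers[b]
    elif opcode == "MUL":
        result = registers[a] * registers[b]
    else:
        raise ValueError(f"unknown opcode: {opcode}")
    registers[dest] = result           # writeback

print(registers)
```

A real processor does these steps in hardware and often overlaps them, but the loop shows the order in which each instruction is handled.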

Computer Data Storage
-core function of computers
-primary storage (internal memory)
-secondary storage (external memory)
-off-line storage (data that is recorded in a secondary device, then disconnected from the machine)

Input/Output
-“In computing, input/output, or I/O, refers to the communication between an information processing system (such as a computer), and the outside world, possibly a human, or another information processing system. Inputs are the signals or data received by the system, and outputs are the signals or data sent from it."

http://en.wikipedia.org/wiki/Software

-Computer software: a collection of computer programs and related data
-In 1935, Alan Turing proposed the first theory about software
-"Colloquially, the term is often used to mean application software. In computer science and software engineering, software is all information processed by computer systems."

-System software provides the basic functions that help the computer operate. Includes: device drivers, operating systems, and utilities.
-Programming software assists programmers in writing computer programs. Includes: compilers, debuggers, and text editors.

2) Scanners and Digitization: Stuart D. Lee. Digitization: Is It Worth It? http://www.infotoday.com/cilmag/may01/lee.htm

-“…the real cost of digitizing and delivering a printed black-and-white, letter-sized page at 300 dpi 1-bit could be as high as 54 cents.”
-“…in summary the listed advantages offered by digitization tend to come under the headings of increasing access, preservation, and meeting strategic goals (i.e., raising the profile of the institution running the project, and so on)”
-“Above all we should not be forgetting that our primary aim is to meet the requirements of the readers and to provide them with the resources they really need to use.”
**The debate on digitization is ongoing, but the author’s last point—considering the readers’ needs—is sound advice for the future of the field.

3) Doreen Carvajal. European libraries face problems in digitizing. New York Times. October 28, 2007 http://www.nytimes.com/2007/10/28/technology/28iht-LIBRARY29.1.8079170.html

-lack of funding for digitization
-considering different business models to pay for costs
-possibility of entering into a contract with Google for digitization projects
-a feeling of hesitancy among libraries and museums for entering into contracts with corporations, in order to protect full public access…the downside is that money is still needed to continue public access


-Data compression: encode information with fewer bits than the original file would use
-“Compression is useful because it helps reduce the consumption of expensive resources, such as hard disk space or transmission bandwidth. On the downside, compressed data must be decompressed to be used, and this extra processing may be detrimental to some applications.”
-Audio compression reduces bandwidth of digital audio streams and the storage size of audio files.
-Video compression: most is lossy, DVDs use a standard coding called MPEG-2
-“Most video compression is lossy: it operates on the premise that much of the data present before compression is not necessary for achieving good perceptual quality. For example, DVDs use a video coding standard called MPEG-2 that can compress video data by 15 to 30 times, while still producing a picture quality that is generally considered high-quality for standard-definition video.”
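
A quick sketch of the trade-off described in these notes, using Python's standard zlib module. Note that zlib is lossless, unlike the lossy video compression in the last bullet; the sample text is made up and deliberately repetitive so that it compresses well.

```python
import zlib

# Highly repetitive sample text (hypothetical), so it compresses well.
original = b"digitization and preservation " * 100

compressed = zlib.compress(original)    # fewer bytes to store or transmit
restored = zlib.decompress(compressed)  # extra processing needed before use

print(len(original), len(compressed))
print(restored == original)
```

The compressed copy takes less space, but it has to be decompressed before it can be read, which is the extra processing cost the quote mentions.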

Tuesday, January 10, 2012