DATA WAREHOUSE AND ITS APPLICATIONS IN AGRICULTURE

K. P. Wagh, Assistant Professor, GCOE, Jalgaon (kishorwagh2000@yahoo.com)
Dr. Satish R. Kolhe, Reader, Jalgaon (srkolhe2000@gmail.com)

A data warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated, which makes it much easier and more efficient to run queries on data that originate from different sources. In other words, a data warehouse is a database used to hold data for reporting and analysis.

The country's economic base and productivity growth depend on agriculture. Agriculture is a way of life and a source of income for most people. Over 60 percent of the population lives in rural areas, and the majority are farmers. Rural communities, the main producers of food and the basis of the country's food security, earn only 11 percent of gross domestic product (GDP). The arrival of the Information Age is guiding the country toward new development strategies. The National Electronics and Computer Technology Center (NECTEC), in collaboration with the Ministry of Agriculture, launched the “Agriculture Information Network” as a response to the unmet information requirements of the agricultural sector. Farmers should benefit from the proposed content, which includes risk assessment, an agricultural warning system and an agricultural knowledge base, designed to improve technology, productivity, income and stability in the agricultural sector in India in the age of information technology. The data warehouse consists of common databases and geospatial databases from various departments and agencies in the country and abroad. Farmers can access the content through the Internet, either by themselves or through groups of professionals called “information brokers”.

Keywords: data warehouse, agriculture, IT

1. Introduction

A data warehouse [1] is a repository of integrated information, available for queries and analysis.
The data and information are extracted from heterogeneous sources as they are generated. This makes it much easier and more efficient to run queries on data originating from different sources. In other words, a data warehouse is a database used to hold data for reporting and analysis.

Objectives of Data Warehousing

• To facilitate reporting and analysis
• To maintain an organization's historical information
• To be an adaptive and resilient source of information
• To provide the basis for decision-making

Data Warehouse Architecture

The architecture of a data warehouse system includes:
• Operational source systems
• A data staging area
• One or more conformed data marts
• A data warehouse database

Operational Source Systems

Operational source systems [1] are designed to capture and process the original business transactions. These systems are designed for data entry, not for reporting, but it is from them that the data warehouse is populated.

Data Staging Area

The data staging area is where the raw operational data is extracted, cleaned, transformed and combined so that it can be reported on and queried by users. This area lies between the operational source systems and the user database, and is generally not accessible to users. Data staging is an important process that includes the following procedures.

Extraction

The extract step is the first step of getting data into the data warehouse environment. Extracting means reading and understanding the source data, and copying the parts that are needed to the data staging area for further work.

Transformation

Once the data has been extracted into the data staging area, there are many possible transformation steps, including:
1. Cleaning the data by correcting misspellings, resolving domain conflicts, dealing with missing data elements, and parsing into standard formats.
2. Purging selected fields from the legacy data that are not relevant to the data warehouse.
3.
Combining data sources by matching exactly on key values or by matching on non-key attributes.
4. Creating surrogate keys for each dimension record in order to avoid dependence on legacy-defined keys; the surrogate key generation process enforces referential integrity between the dimension tables and the fact tables.
5. Building aggregates to boost the performance of common queries.

Loading and Indexing

At the end of the transformation process, the data is in the form of load record images. Loading in the data warehouse environment usually takes the form of replicating the dimension tables and fact tables and presenting them to the bulk loading facilities of each recipient data mart. Bulk loading is a very important capability, to be contrasted with record-at-a-time loading, which is much slower. The target data mart must then index the newly arrived data for query performance.

Data Mart

A data mart is a logical subset of a complete data warehouse. For example, a data warehouse for a retail chain is built up gradually from individual data marts dealing with separate subject areas such as product sales. Dimensional data marts are organized by subject area, such as sales, finance and marketing, and coordinated by data categories such as customer, product and location. These flexible information stores allow data structures to respond to business changes: product line additions, new staff responsibilities, mergers, consolidations and acquisitions.

Data Warehouse Database

A data warehouse database contains data that is organized and stored specifically for direct user queries and reports. It differs from an OLTP database in that it is designed primarily for reads, not writes. An OLAP application is a system designed for few but complex (read-only) requests. An OLTP application is a system designed for many simultaneous but simple requests (including updates).
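The staging procedures described in this section (cleaning, purging, surrogate keys, aggregates, bulk loading and indexing) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a description of any system named in this paper: the sample records, the spelling-correction table, the table names (dim_crop, dim_district, fact_sales, agg_sales_by_crop) and the use of an in-memory SQLite database are all hypothetical, and treating a missing quantity as zero is only one of several possible policies.

```python
import sqlite3

# Hypothetical raw records extracted from an operational source system.
RAW_SALES = [
    {"crop": "Wheet", "district": "jalgaon ", "qty_kg": "120", "clerk": "A"},
    {"crop": "Wheat", "district": "Jalgaon", "qty_kg": None, "clerk": "B"},
    {"crop": "Rice", "district": "Pune", "qty_kg": "80", "clerk": "A"},
]

SPELLING_FIXES = {"Wheet": "Wheat"}  # illustrative correction table (assumption)


def clean(record):
    """Steps 1 and 2: fix misspellings, standardize formats, handle missing
    values, and purge fields the warehouse does not need."""
    crop = SPELLING_FIXES.get(record["crop"], record["crop"])
    district = record["district"].strip().title()
    # Assumption: a missing quantity is recorded as 0.
    qty = int(record["qty_kg"]) if record["qty_kg"] is not None else 0
    return {"crop": crop, "district": district, "qty_kg": qty}  # "clerk" is dropped


def assign_surrogate_keys(values):
    """Step 4: map each distinct natural value to a warehouse-owned integer key."""
    return {value: key for key, value in enumerate(sorted(set(values)), start=1)}


def run_etl(raw_records):
    rows = [clean(r) for r in raw_records]
    crop_keys = assign_surrogate_keys(r["crop"] for r in rows)
    district_keys = assign_surrogate_keys(r["district"] for r in rows)

    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE dim_crop (crop_key INTEGER PRIMARY KEY, crop TEXT);
        CREATE TABLE dim_district (district_key INTEGER PRIMARY KEY, district TEXT);
        CREATE TABLE fact_sales (crop_key INTEGER REFERENCES dim_crop,
                                 district_key INTEGER REFERENCES dim_district,
                                 qty_kg INTEGER);
    """)
    # Bulk load: one executemany call per table instead of one statement per record.
    db.executemany("INSERT INTO dim_crop VALUES (?, ?)",
                   [(key, value) for value, key in crop_keys.items()])
    db.executemany("INSERT INTO dim_district VALUES (?, ?)",
                   [(key, value) for value, key in district_keys.items()])
    db.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                   [(crop_keys[r["crop"]], district_keys[r["district"]], r["qty_kg"])
                    for r in rows])
    # Index the newly loaded data for query performance.
    db.execute("CREATE INDEX idx_fact_crop ON fact_sales (crop_key)")
    # Step 5: a prebuilt aggregate for a common query.
    db.execute("""CREATE TABLE agg_sales_by_crop AS
                  SELECT crop_key, SUM(qty_kg) AS total_kg
                  FROM fact_sales GROUP BY crop_key""")
    db.commit()
    return db
```

Calling run_etl(RAW_SALES) returns a connection whose fact table refers to crops and districts only through the generated surrogate keys, so the warehouse does not depend on the spelling or key conventions of the source system.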
Metadata

Metadata describes the contents and location of the data in the data warehouse, the relationships between the operational databases and the warehouse data, and the business views of the warehouse data that are accessible to end-user tools. The metadata is searched by users to find subject areas and data definitions. For decision support, the pointers needed to reach the warehouse data are supplied by the metadata. It therefore acts as a logical link between the decision support application and the data warehouse. Thus, whatever data warehouse design is adopted, it must ensure that there is a mechanism that populates and maintains the metadata repository, and that all access paths to the data warehouse have the metadata as an entry point. In other words, no direct access to the warehouse data should be permitted unless the metadata definitions are used to gain that access. Metadata definitions can be made by the user in any data warehousing environment. The software environment, as determined by the software tools used, will provide a facility for defining metadata in a metadata repository.

OLAP vs OLTP

OLTP (Online Transaction Processing)
• OLTP servers handle critical production data accessed through simple queries
• Usually handles queries of an automated nature
• OLTP applications consist of a large number of relatively simple transactions
• In most cases, contains data organized on the basis of logical relationships between normalized tables

OLAP (Online Analytical Processing)
• OLAP servers handle management-critical data accessed through an iterative analytical investigation
• Usually handles queries of an ad hoc nature
• Supports more complex and demanding transactions
• Contains logically organized data in multiple dimensions

2. Warehouse Design Patterns

Dimensional modeling is a term used to describe a set of data modeling techniques that have gained popularity and acceptance for data warehouse implementations. Dimensional modeling is one of the key techniques of data warehousing.
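To make the OLTP/OLAP contrast and the idea of a dimensional layout more concrete, the following is a minimal sketch using an in-memory SQLite database. Everything in it is an assumption made for illustration: the table names (dim_market, fact_sale), the sample rows and the two helper functions are hypothetical and are not taken from any system discussed in this paper. One function performs a short single-row write of the kind an OLTP system handles in large numbers; the other performs a read-only aggregation across two dimensions (state and year), which is the kind of request an OLAP system is designed for.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Dimension table: descriptive context for the measures (hypothetical).
    CREATE TABLE dim_market (market_key INTEGER PRIMARY KEY, state TEXT, mandi TEXT);
    -- Fact table: numeric measures keyed by dimension (hypothetical).
    CREATE TABLE fact_sale (sale_id INTEGER PRIMARY KEY,
                            market_key INTEGER REFERENCES dim_market,
                            sale_year INTEGER,
                            tonnes REAL);
    INSERT INTO dim_market VALUES
        (1, 'Maharashtra', 'Jalgaon'),
        (2, 'Maharashtra', 'Pune'),
        (3, 'Gujarat', 'Surat');
    INSERT INTO fact_sale VALUES
        (1, 1, 2009, 4.0),
        (2, 1, 2010, 6.0),
        (3, 2, 2010, 10.0),
        (4, 3, 2010, 2.5);
""")


def record_sale(market_key, year, tonnes):
    """OLTP-style operation: a short, simple write that touches one row."""
    db.execute(
        "INSERT INTO fact_sale (market_key, sale_year, tonnes) VALUES (?, ?, ?)",
        (market_key, year, tonnes),
    )
    db.commit()


def tonnes_by_state_and_year():
    """OLAP-style operation: a read-only aggregate over two dimensions."""
    return db.execute("""
        SELECT m.state, f.sale_year, SUM(f.tonnes)
        FROM fact_sale f JOIN dim_market m ON m.market_key = f.market_key
        GROUP BY m.state, f.sale_year
        ORDER BY m.state, f.sale_year
    """).fetchall()
```

In a real deployment the two workloads would normally run on separate databases, with the warehouse populated from the operational system through the staging process; a single database is used here only to keep the sketch short.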
Two types of tables are used in dimensional modeling: fact tables and dimension tables.

Fact tables

Fact tables are used to record actual facts or measures in the business. Facts are the numeric data items that are of interest to the business. Examples from telecommunications: the length of a call in minutes, the average number of calls.

Dimension tables

Dimension tables establish the context of the facts. Dimension tables store fields that describe the facts. Example from telecommunications: the origin and destination of a telephone call.

A schema is a fact table together with its associated dimension tables.

3. Crucial Decisions in Designing a Data Warehouse

The job of designing and implementing a data warehouse [3] is very difficult and challenging, even though, at the same time, great interest and importance are attached to it. The designer of the data warehouse may be asked by management: ‘Take all corporate data and build a data warehouse such that management can get answers to all their questions.’ This is a daunting task, and the responsibility is both visible and exciting. But how to begin? Where to start? Which data should be loaded first? Where is that data available? Which queries must be answered? How can the scope of the project be reduced to something smaller and manageable, yet capable of being gradually upgraded into a full data warehouse environment? The recent trend is to build data marts before a real, large data warehouse is built. People want something small, so as to obtain manageable results before moving on to the actual data warehouse.

Ralph Kimball proposed a nine-step method, as follows:
Step 1: Choose the subject matter.
Step 2: Decide what the fact table represents.
Step 3: Identify and conform the dimensions.
Step 4: Choose the facts.
Step 5: Store precalculations in the fact table.
Step 6: Define the dimension tables.
Step 7: Decide the duration of the database and the frequency of updates.
Step 8: Track slowly changing dimensions.
Step 9: Decide the query priorities and the query modes.

All the above steps are required before the data warehouse is implemented. The final step, step 10, is to implement a simple data warehouse or data mart. The approach should be “from simple to complex”: first, a few data marts are identified, designed and implemented; a data warehouse then emerges gradually.

Consider what the steps above involve in practice. Interaction with users is essential for obtaining answers to many of these questions. The users to be interviewed include senior managers, middle managers and executives, as well as business users, in addition to sales and marketing staff. A clear picture then emerges of the overall data warehousing project: what the users' problems are and how they can possibly be solved with the help of data warehousing.

4. Various Technological Considerations

The following technological issues [3] must be considered when designing and implementing a data warehouse:
1. The hardware platform for the data warehouse
2. The DBMS to support the data warehouse
3. The communication and networking infrastructure for the data warehouse
4. The operating system platform / system management
5. The software tools for building, operating and using the data warehouse

Hardware platform

Organizations normally tend to use the hardware platforms they already have for data warehouse development. However, the disk storage requirements of a data warehouse will be considerable, especially in comparison with a single application. For a small data warehouse or data mart, an ordinary Pentium server will probably be adequate, provided the reliability requirements are not very high. A large data warehouse, however, needs a server dedicated to the tasks associated with the warehouse. A mainframe, for example, is well suited to serve as a data warehouse server. What characteristics does a server need in order to be a successful data warehouse server?
First, it should be capable of supporting large volumes of data and complex query processing. In addition, it must be highly scalable: as the user population continues to grow, network traffic and query traffic increase significantly. The requirements for a data warehouse server are therefore scalable, high-performance data loading and ad hoc query processing, together with the ability to support large databases reliably and efficiently. If queries are expected to run against large volumes of data, a multiprocessor configuration is required for parallel query processing. In a complex server configuration with multiple processors and high I/O bandwidth, a balance must be struck between I/O and processing power.

DBMS selection

Next to the hardware, the most critical factor is the selection of the DBMS. This determines the speed and performance of the data warehousing environment. The requirements of a DBMS for data warehousing are scalability, high performance in storage and processing, and high traffic throughput. Most RDBMS vendors have implemented various degrees of parallelism in their products. Although all the well-known vendors, such as IBM, Oracle and Sybase, support parallel database processing, some of them have improved their architectures to better meet the specialized requirements of a data warehouse. RDBMS products also provide additional modules for OLAP cubes. The right choice of OLAP server, database server and web server can be made by the designer or user of the data warehouse as needed.

Communication and networking infrastructure

A data warehouse can be Internet-enabled or intranet-based, as the choice may be. If it is Internet-enabled, the networking is taken care of by the Internet. If it is only intranet-based, an appropriate LAN environment should be accessible to all identified users.
Thus, the network may need to be expanded as required. For web-enabled data warehouses, the issues of security, confidentiality and accessibility must be carefully considered. Web-enabling facilities should therefore be provided in the software tools used for data warehouse development.

Steps for implementing a data warehouse

A data warehouse cannot simply be purchased and installed. Its implementation requires the integration of many products. The steps of a data warehouse implementation are as follows:
Step 1: Collect and analyze the business requirements.
Step 2: Create a data model and a physical design for the data warehouse before deciding on the appropriate hardware platform.
Step 3: Define the data sources.
Step 4: Select the DBMS and software platform for the data warehouse.
Step 5: Extract the data from the operational data sources, transform it, clean it and load it into the data warehouse or data mart model.
Step 6: Select the database access and reporting tools.
Step 7: Choose the database connectivity software.
Step 8: Select the data analysis (OLAP) and presentation (GUI) software.
Step 9: Keep refreshing the data warehouse periodically.

Access tools

Currently, with the exception of SAS (SAS Institute), no data warehouse / OLAP vendor offers a complete one-stop software tool that handles all aspects of a data warehousing implementation project. SAS alone largely meets this requirement on its own, because it has its own internal database together with the ability to import data from any vendor's database software. Therefore, a data warehouse and data mining solution can be implemented independently with SAS. The best way to choose a set of tools is to understand the capabilities and compatibility of the different types of access to data and reporting, and to select the best tool on the market for each kind of access. The types of access and reporting are the following:
1. Time series analysis
2. Data visualization: graphing, charting and pivoting
3. Complex textual search (text mining)
4.
General statistical analysis
5. Artificial intelligence techniques for testing hypotheses, discovering trends, and identifying and validating data clusters and segments (also useful for data mining)
6. Mapping of spatial information in GIS
7. Ad hoc user-specified queries
8. Predefined repeatable queries
9. Interactive drill-down
10. Analysis of reports by drilling down
11. Complex queries with multi-table joins, multi-level subqueries and sophisticated search criteria

In some applications, user needs may exceed the capabilities of the tools. A number of query tools are available on the market today that allow an ordinary user to build customized reports easily by composing and executing ad hoc queries, without any need to know the details of database design or technology, SQL, or even the data model.

5. Its Applications in Agriculture

Project: Agricultural Information System Network (AGRISNET)

The Department of Agriculture and Cooperation (DAC) [2] has taken steps to establish the “Agricultural Information System Network (AGRISNET)” in collaboration with NIC. The proposal recommends (i) putting in place state-of-the-art infrastructure as AGRISNET, an intranet over NICNET, (ii) development of databases and information systems for decision support in evaluation, monitoring and policy formulation, (iii) human resource development, (iv) multimedia-based training and demonstration of technology transfer to strengthen agricultural research and education using VSAT broadcast, and (v) interest groups with respect to subjects, problems, programmes, schemes, etc., and, above all, bringing Indian agriculture online for Internet and intranet access through AGRISNET nodes.
AGRISNET nodes are proposed to be established at:
• DAC headquarters (Krishi Bhawan)
• DAC attached offices and subordinate offices, and their regional units
• DAC public sector enterprises (NSC & SFCIs) and their sub-units
• DAC autonomous organizations
• Apex organizations
• State Agriculture Departments
• NCT / UT Agriculture Departments
• District Agriculture Offices
• Block Agriculture Offices

In this direction, IFFCO has taken up a project in association with the Indian Space Research Organisation (ISRO) to use satellite-based remote sensing data and geographic information systems (GIS). It may be noted that developed countries have been practising precision farming with such tools for a long time. Although this will take considerable time in our country because of its small farms, GIS has a valuable role to play even under existing conditions. Remote sensing and GIS can provide information on crop development, early warnings, crop vigour, etc. The IFFCO-ISRO GIS project supports the fast and efficient availability of fertilizers to farmers and gives IFFCO much better logistics and operational efficiency. It also seeks to provide an advisory service that gives farmers decision support on land-related issues, weather, etc.

In addition to GIS-based services, efforts are being made to create databases that contain information of interest to farmers. These include recommended packages of practices for major cereals, pulses, horticulture, floriculture, animal husbandry, etc. Information on all inputs, such as seeds and fertilizers, their sources, availability and prices, and on the availability of credit, the available alternatives and their terms, etc. is also sought to be provided. An important planned service is access to the nearest expert in times of crop stress or any other known crop problem. A facility is also sought to be provided to encourage social and agricultural experimentation by creating forums for different crops.
Many agricultural extension services are also proposed to be made available online using multimedia. To help farmers get the best possible price, information on the various agricultural produce markets (mandis) is also provided. The objective is to report the prices prevailing in different mandis so that a farmer can take his produce to the mandi where he can expect a better price. Other areas of interest to farmers, such as distance learning, location-specific news, etc., are also provided. Access to other sites of related interest, such as those relating to courts, health, etc., is also sought to be provided.

6. Conclusions

Analytical exploration of the vast amount of agricultural data can be better supported by the appropriate application of data warehousing and OLAP technology. A data warehouse provides an effective structure for the reliable storage of large quantities of data, while OLAP techniques provide the mechanisms for analysing these data.

7. References

[1] Anil Rai, Data Warehouse and Its Applications in Agriculture, Indian Agricultural Statistics Research Institute, Library Avenue, New Delhi.
[2] S. C. Mittal, Information Technology in Agriculture.
[3] C. S. R. Prabhu, Data Warehousing: Concepts, Techniques, Products and Applications.




