The LHC Computing Grid Project: Technical Design Report. LHCC, 29 June 2005. Jürgen Knobloch, IT Department, CERN. This file is available at: http://cern.ch/lcg/tdr/LCG_TDR.ppt
Technical Design Report - limitations • Computing is different from detector building • It’s ‘only’ software – can and will be adapted as needed • Technology evolves rapidly – we have no control • Prices go down – Moore’s law – Buy just in time – Understand startup • We are in the middle of planning the next phase • The Memorandum of Understanding (MoU) is being finalized • The list of Tier-2 centres is evolving • Baseline Services have been agreed • EGEE continuation is being discussed • Experience from Service Challenges will be incorporated • Some of the information is made available from (dynamic) Web-sites • The LCG TDR appears simultaneously with the experiments’ ones • Some inconsistencies may have passed undetected • Some people were occupied on both sides
The LCG Project • Approved by the CERN Council in September 2001 • Phase 1 (2001-2004): Development and prototyping – a distributed production prototype at CERN and elsewhere that will be operated as a platform for the data challenges, leading to a Technical Design Report, which will serve as a basis for agreeing the relations between the distributed Grid nodes and their co-ordinated deployment and exploitation. • Phase 2 (2005-2007): Installation and operation of the full world-wide initial production Grid system, requiring continued manpower efforts and substantial material resources. • A Memorandum of Understanding • … has been developed defining the Worldwide LHC Computing Grid Collaboration of CERN as host lab and the major computing centres. • Defines the organizational structure for Phase 2 of the project.
Organizational Structure for Phase 2 [Organization chart] • LHC Committee (LHCC) – Scientific Review • Computing Resources Review Board (C-RRB) – Funding Agencies • Collaboration Board (CB) – Experiments and Regional Centres • Overview Board (OB) • Management Board (MB) – Management of the Project • Grid Deployment Board – Coordination of Grid Operation • Architects Forum – Coordination of Common Applications
Cooperation with other projects • Network Services • LCG will be one of the most demanding applications of national research networks such as the pan-European backbone network, GÉANT • Grid Software • Globus, Condor and VDT have provided key components of the middleware used. Key members participate in OSG and EGEE • Enabling Grids for E-sciencE (EGEE) includes a substantial middleware activity. • Grid Operational Groupings • The majority of the resources used are made available as part of the EGEE Grid (~140 sites, 12,000 processors). EGEE also supports Core Infrastructure Centres and Regional Operations Centres. • The US LHC programmes contribute to and depend on the Open Science Grid (OSG). Formal relationship with LCG through US-Atlas and US-CMS computing projects. • The Nordic Data Grid Facility (NDGF) will begin operation in 2006. Prototype work is based on the NorduGrid middleware ARC.
The Hierarchical Model • Tier-0 at CERN • Record RAW data (1.25 GB/s ALICE) • Distribute second copy to Tier-1s • Calibrate and do first-pass reconstruction • Tier-1 centres (11 defined) • Manage permanent storage – RAW, simulated, processed • Capacity for reprocessing, bulk analysis • Tier-2 centres (>~ 100 identified) • Monte Carlo event simulation • End-user analysis • Tier-3 • Facilities at universities and laboratories • Access to data and processing in Tier-2s, Tier-1s • Outside the scope of the project
Tier-2s ~100 identified – number still growing
The Eventflow • 50 days running in 2007 • 10^7 seconds/year pp from 2008 on, ~10^9 events/experiment • 10^6 seconds/year heavy ion
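As a rough cross-check of the numbers on this slide, the sketch below derives the average event rate implied by ~10^9 events in 10^7 seconds of pp running, and the heavy-ion RAW volume implied by the 1.25 GB/s ALICE recording rate quoted for Tier-0. It rests on assumptions that are not in the slides: the event count is read as per experiment per pp year, 1 GB is taken as 10^9 bytes, and the 1.25 GB/s rate is treated as sustained over the full 10^6 seconds, which makes the volume an upper-bound estimate. The function names are illustrative only.

```python
# Back-of-the-envelope figures from the slide (assumed to be per year).
PP_SECONDS_PER_YEAR = 1e7      # pp running time from 2008 on
HI_SECONDS_PER_YEAR = 1e6      # heavy-ion running time
EVENTS_PER_EXPERIMENT = 1e9    # approximate events per experiment
ALICE_RAW_RATE_GB_S = 1.25     # RAW recording rate quoted for Tier-0


def average_event_rate_hz(events: float, live_seconds: float) -> float:
    """Average recorded event rate over the live time."""
    return events / live_seconds


def yearly_volume_pb(rate_gb_s: float, live_seconds: float) -> float:
    """Data volume in PB, assuming the rate is sustained (1 PB = 1e6 GB)."""
    return rate_gb_s * live_seconds / 1e6


if __name__ == "__main__":
    rate = average_event_rate_hz(EVENTS_PER_EXPERIMENT, PP_SECONDS_PER_YEAR)
    volume = yearly_volume_pb(ALICE_RAW_RATE_GB_S, HI_SECONDS_PER_YEAR)
    print(f"average pp event rate: {rate:.0f} Hz")
    print(f"ALICE heavy-ion RAW volume (upper bound): {volume:.2f} PB/year")
```

Under these assumptions the average pp rate comes out at about 100 Hz per experiment and the ALICE heavy-ion RAW volume at about 1.25 PB per year.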
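CPU Requirements [Chart: requirements split between CERN, Tier-1 and Tier-2; 58% pledged]
Disk Requirements [Chart: requirements split between CERN, Tier-1 and Tier-2; 54% pledged]
Tape Requirements [Chart: requirements split between CERN and Tier-1; 75% pledged]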
Experiments’ Requirements • Single Virtual Organization (VO) across the Grid • Standard interfaces for Grid access to Storage Elements (SEs) and Computing Elements (CEs) • Need for a reliable Workload Management System (WMS) to efficiently exploit distributed resources • Non-event data such as calibration and alignment data, but also detector construction descriptions, will be held in databases • Read/write access to central (Oracle) databases at Tier-0 and read access at Tier-1s with a local database cache at Tier-2s • Analysis scenarios and specific requirements are still evolving • Prototype work is in progress (ARDA) • Online requirements are outside of the scope of LCG, but there are connections: • Raw data transfer and buffering • Database management and data export • Some potential use of Event Filter Farms for offline processing
Architecture – Grid services • Storage Element • Mass Storage System (MSS) (CASTOR, Enstore, HPSS, dCache, etc.) • Storage Resource Manager (SRM) provides a common way to access MSS, independent of implementation • File Transfer Services (FTS) provided e.g. by GridFTP or srmCopy • Computing Element • Interface to local batch system e.g. Globus gatekeeper. • Accounting, status query, job monitoring • Virtual Organization Management • Virtual Organization Management Services (VOMS) • Authentication and authorization based on VOMS model. • Grid Catalogue Services • Mapping of Globally Unique Identifiers (GUID) to local file name • Hierarchical namespace, access control • Interoperability • EGEE and OSG both use the Virtual Data Toolkit (VDT) • Different implementations are hidden by common interfaces
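The catalogue role described on this slide (mapping a Globally Unique Identifier to a site-local file name inside a hierarchical namespace) can be illustrated with a small in-memory sketch. This is not the interface of any real Grid catalogue: the class `FileCatalogue`, its methods, and the example paths and site names are all hypothetical, and access control is left out for brevity.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class FileCatalogue:
    """Toy catalogue: logical file name -> GUID -> {site: local file name}."""

    _lfn_to_guid: dict = field(default_factory=dict)
    _replicas: dict = field(default_factory=dict)

    def register(self, lfn: str, site: str, local_name: str) -> str:
        """Record a replica; the GUID is created on first registration."""
        if lfn not in self._lfn_to_guid:
            self._lfn_to_guid[lfn] = str(uuid.uuid4())
        guid = self._lfn_to_guid[lfn]
        self._replicas.setdefault(guid, {})[site] = local_name
        return guid

    def resolve(self, guid: str, site: str) -> str:
        """Map a GUID to the local file name at one site."""
        return self._replicas[guid][site]

    def list_dir(self, directory: str) -> list:
        """List logical file names below a directory of the namespace."""
        prefix = directory.rstrip("/") + "/"
        return sorted(l for l in self._lfn_to_guid if l.startswith(prefix))


if __name__ == "__main__":
    cat = FileCatalogue()
    # Hypothetical names, for illustration only.
    guid = cat.register("/grid/vo/raw/run1.dat", "site-a", "/pool1/f0001")
    cat.register("/grid/vo/raw/run1.dat", "site-b", "/data/x/f77")
    print(guid, cat.resolve(guid, "site-b"), cat.list_dir("/grid/vo/raw"))
```

The point of the design is that jobs refer to the GUID or logical name only, so the same request resolves to whichever local name the chosen site uses.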
Baseline Services Mandate The goal of the working group is to forge an agreement between the experiments and the LHC regional centres on the baseline services to be provided to support the computing models for the initial period of LHC running, which must therefore be in operation by September 2006. The services concerned are those that supplement the basic services for which there is already general agreement and understanding (e.g. provision of operating system services, local cluster scheduling, compilers, ..) and which are not already covered by other LCG groups such as the Tier-0/1 Networking Group or the 3D Project. … Members Experiments: ALICE: L. Betev, ATLAS: M. Branco, A. de Salvo, CMS: P. Elmer, S. Lacaprara, LHCb: P. Charpentier, A. Tsaragorodtsev Projects: ARDA: J. Andreeva, Apps Area: D. Düllmann, gLite: E. Laure Sites: F. Donno (It), A. Waananen (Nordic), S. Traylen (UK), R. Popescu, R. Pordes (US) Chair: I. Bird, Secretary: M. Schulz Timescale: 15 February to 17 June 2005
Architecture – Tier-0 [Network diagram: WAN, campus network and experimental areas attached to a 2.4 Tb/s core; distribution layer with 10 Gb/s to 32×1 Gb/s fan-out; ~2000 tape and disk servers; ~6000 CPU servers x 8000 SPECINT2000 (2008); link types: Gigabit Ethernet, ten-gigabit Ethernet, double ten-gigabit Ethernet]
Tier-0 components • Batch system (LSF) manages CPU resources • Shared file system (AFS) • Disk pool and mass storage (MSS) manager (CASTOR) • Extremely Large Fabric management system (ELFms) • Quattor – system administration – installation and configuration • LHC Era MONitoring (LEMON) system, server/client based • LHC-Era Automated Fabric (LEAF) – high-level commands to sets of nodes • CPU servers – ‘white boxes’, Intel processors, (Scientific) Linux • Disk Storage – Network Attached Storage (NAS) – mostly mirrored • Tape Storage – currently STK robots – future system under evaluation • Network – fast gigabit Ethernet switches connected to multigigabit backbone routers
Tier-0 -1 -2 Connectivity • National Research Networks (NRENs) at Tier-1s: ASnet, LHCnet/ESnet, GARR, LHCnet/ESnet, RENATER, DFN, SURFnet6, NORDUnet, RedIRIS, UKERNA, CANARIE
Technology - Middleware • Currently, the LCG-2 middleware is deployed in more than 100 sites • It originated from Condor, EDG, Globus, VDT, and other projects. • It will now evolve to include functionality of the gLite middleware provided by the EGEE project, which has just been made available. • In the TDR, we describe the basic functionality of LCG-2 middleware as well as the enhancements expected from gLite components. • Site services include security, the Computing Element (CE), the Storage Element (SE), Monitoring and Accounting Services – currently available both from LCG-2 and gLite. • VO services such as Workload Management System (WMS), File Catalogues, Information Services, File Transfer Services exist in both flavours (LCG-2 and gLite) maintaining close relations with VDT, Condor and Globus.
Technology – Fabric Technology • Moore’s law still holds for processors and disk storage • For CPU and disks we count a lot on the evolution of the consumer market • For processors we expect an increasing importance of 64-bit architectures and multicore chips • The cost break-even point between disk and tape store will not be reached for the initial LHC computing • Mass storage (tapes and robots) is still a computer centre item with computer centre pricing • It is too early to conclude on new tape drives and robots • Networking has seen a rapid evolution recently • Ten-gigabit Ethernet is now in the production environment • Wide-area networking can already now count on 10 Gb connections between Tier-0 and Tier-1s. This will move gradually to the Tier-1 – Tier-2 connections.
Common Physics Applications • Core software libraries: SEAL-ROOT merger; scripting: CINT, Python; mathematical libraries; fitting, MINUIT (in C++) • Data management: POOL (ROOT I/O for bulk data, RDBMS for metadata); conditions database – COOL • Event simulation: event generators: generator library (GENSER); detector simulation: GEANT4 (ATLAS, CMS, LHCb); physics validation, compare GEANT4, FLUKA, test beam • Software development infrastructure: external libraries; software development and documentation tools; quality assurance and testing; project portal: Savannah
Prototypes • It is important that the hardware and software systems developed in the framework of LCG be exercised in more and more demanding challenges • Data Challenges were recommended by the ‘Hoffmann Review’ of 2001. They have now been done by all experiments. Though the main goal was to validate the distributed computing model and to gradually build the computing systems, the results have been used for physics performance studies and for detector, trigger, and DAQ design. Limitations of the Grids have been identified and are being addressed. • Presently, a series of Service Challenges aims at realistic end-to-end testing of experiment use-cases over an extended period, leading to stable production services. • The project ‘A Realisation of Distributed Analysis for LHC’ (ARDA) is developing end-to-end prototypes of distributed analysis systems using the EGEE middleware gLite for each of the LHC experiments.
Data Challenges • ALICE • PDC04 using AliEn services native or interfaced to LCG-Grid. 400,000 jobs run producing 40 TB of data for the Physics Performance Report. • PDC05: Event simulation, first-pass reconstruction, transmission to Tier-1 sites, second pass reconstruction (calibration and storage), analysis with PROOF – using Grid services from LCG SC3 and AliEn • ATLAS • Using tools and resources from LCG, NorduGrid, and Grid3 at 133 sites in 30 countries using over 10,000 processors where 235,000 jobs produced more than 30 TB of data using an automatic production system. • CMS • 100 TB simulated data reconstructed at a rate of 25 Hz, distributed to the Tier-1 sites and reprocessed there. • LHCb • LCG provided more than 50% of the capacity for the first data challenge 2004-2005. The production used the DIRAC system.
Service Challenges • A series of Service Challenges (SC) set out to successively approach the production needs of LHC • While SC1 did not meet the goal of transferring continuously for 2 weeks at a rate of 500 MB/s, SC2 exceeded the goal (500 MB/s) by sustaining a throughput of 600 MB/s to 7 sites. • SC3 will start now, using gLite middleware components, with disk-to-disk throughput tests and 10 Gb/s networking of Tier-1s to CERN, providing an SRM (1.1) interface to managed storage at Tier-1s. The goal is to achieve 150 MB/s disk-to-disk and 60 MB/s to managed tape. There will also be Tier-1 to Tier-2 transfer tests. • SC4 aims to demonstrate that all requirements from raw data taking to analysis can be met at least 6 months prior to data taking. The aggregate rate out of CERN is required to be 1.6 GB/s to tape at Tier-1s. • The Service Challenges will turn into production services for the experiments.
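To give a feeling for what these throughput targets mean in data volume, the sketch below converts a sustained rate into terabytes moved. It is plain arithmetic under stated assumptions: decimal units (1 TB = 10^6 MB), a perfectly sustained rate with no downtime, and an illustrative helper name `volume_tb` that does not come from the slides.

```python
SECONDS_PER_DAY = 86_400


def volume_tb(rate_mb_s: float, days: float) -> float:
    """Data moved in TB at a sustained rate (decimal units, 1 TB = 1e6 MB)."""
    return rate_mb_s * SECONDS_PER_DAY * days / 1e6


if __name__ == "__main__":
    # Two weeks at the 500 MB/s goal set for SC1 and SC2.
    print(f"500 MB/s for 14 days: {volume_tb(500, 14):.1f} TB")
    # One day at the 1.6 GB/s aggregate rate required for SC4.
    print(f"1.6 GB/s for 1 day:   {volume_tb(1600, 1):.2f} TB")
```

With these assumptions, two weeks at 500 MB/s corresponds to roughly 605 TB, and the SC4 aggregate rate of 1.6 GB/s to roughly 138 TB per day.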
Key dates for Service Preparation [Timeline 2005-2008: SC3, SC4, LHC Service Operation; milestones: cosmics, first beams, first physics, full physics run] • Sep05 – SC3 Service Phase • May06 – SC4 Service Phase • Sep06 – Initial LHC Service in stable operation • Apr07 – LHC Service commissioned • SC3 – Reliable base service – most Tier-1s, some Tier-2s – basic experiment software chain – grid data throughput 1 GB/s, including mass storage 500 MB/s (150 MB/s & 60 MB/s at Tier-1s) • SC4 – All Tier-1s, major Tier-2s – capable of supporting full experiment software chain incl. analysis – sustain nominal final grid data throughput (~1.5 GB/s mass storage throughput) • LHC Service in Operation – September 2006 – ramp up to full operational capacity by April 2007 – capable of handling twice the nominal data throughput
ARDA – A Realisation of Distributed Analysis for LHC • Distributed analysis on the Grid is the most difficult and least defined topic • ARDA sets out to develop end-to-end analysis prototypes using the LCG-supported middleware. • ALICE uses the AliROOT framework based on PROOF. • ATLAS has used DIAL services with the gLite prototype as backend. • CMS has prototyped the ‘ARDA Support for CMS Analysis Processing’ (ASAP) that is used by several CMS physicists for daily analysis work. • LHCb has based its prototype on GANGA, a common project between ATLAS and LHCb.
Thanks to … • EDITORIAL BOARD • I. Bird, K. Bos, N. Brook, D. Duellmann, C. Eck, I. Fisk, D. Foster, B. Gibbard, C. Grandi, F. Grey, J. Harvey, A. Heiss, F. Hemmer, S. Jarp, R. Jones, D. Kelsey, J. Knobloch, M. Lamanna, H. Marten, P. Mato Vila, F. Ould-Saada, B. Panzer-Steindel, L. Perini, L. Robertson, Y. Schutz, U. Schwickerath, J. Shiers, T. Wenaus • Contributions from • J.P. Baud, E. Laure, C. Curran, G. Lee, A. Marchioro, A. Pace, and D. Yocum, A. Aimar, I. Antcheva, J. Apostolakis, G. Cosmo, O. Couet, M. Girone, M. Marino, L. Moneta, W. Pokorski, F. Rademakers, A. Ribon, S. Roiser, and R. Veenhof • Quality assurance by • The CERN Print Shop, F. Baud-Lavigne, S. Leech O’Neale, R. Mondardini, and C. Vanoli • … and the members of the Computing Groups of the LHC experiments who either directly contributed or have provided essential feed-back.