1. What is Replication?
Ans. Replication is a set of technologies for copying and distributing data and database objects from one database to another and then synchronizing between databases to maintain consistency. Using replication, you can distribute data to different locations and to remote or mobile users over local and wide area networks, dial-up connections, wireless connections, and the Internet.
Transactional replication is typically used in server-to-server scenarios that require high throughput, including: improving scalability and availability; data warehousing and reporting; integrating data from multiple sites; integrating heterogeneous data; and offloading batch processing. Merge replication is primarily designed for mobile applications or distributed server applications that have possible data conflicts. Common scenarios include: exchanging data with mobile users; consumer point of sale (POS) applications; and integration of data from multiple sites. Snapshot replication is used to provide the initial data set for transactional and merge replication; it can also be used when complete refreshes of data are appropriate. With these three types of replication, SQL Server provides a powerful and flexible system for synchronizing data across your enterprise.
Transactional replication and peer-to-peer replication use the same architecture to move changes between the servers in a replication topology. The following illustration is an overview of the components involved in transactional replication.
Figure 2: Transactional replication architecture overview
A minimum of three server roles are required for transactional replication:
• Publisher, hosting the publication database
• Distributor, hosting the distribution database
• Subscriber, hosting the subscription database
Depending on the complexity of the replication topology, there may be multiple Subscriber servers or, in the case of peer-to-peer replication, multiple peer servers with the replication stream flowing in both directions between the peers. Furthermore, the roles of the various replication servers can be played by one server or by individual servers (the more common case), and it is possible for a server to play any combination of roles. Regardless, the various servers and databases must be protected to ensure that the replication stream is highly available.
Transactional replication relies on various agents to perform the tasks associated with tracking changes and distributing data. These agents are:
• Snapshot Agent, which runs at the Distributor. This agent prepares schema and initial data files of published tables and other objects, stores the snapshot files, and records information about synchronization in the distribution database.
• Log Reader Agent, which runs at the Distributor. This agent connects to the Publisher and moves transactions marked for replication from the transaction log of the publication database to the distribution database.
• Distribution Agent, which runs at the Distributor for push subscriptions, and at the Subscriber for pull subscriptions. This agent applies the (optional) initial snapshot to the Subscribers and moves transactions held in the distribution database to Subscribers
• Queue Reader Agent, which runs at the Distributor. This agent is only used for transactional replication with updateable subscriptions and moves changes made on the Subscribers back to the Publisher.
2. What is Peer to Peer replication in SQL server 2008?
Ans. Peer-to-peer replication provides a scale-out and high-availability solution by maintaining copies of data across multiple server instances, also referred to as nodes. Built on the foundation of transactional replication, peer-to-peer replication propagates transactionally consistent changes in near real-time. This enables applications that require scale-out of read operations to distribute the reads from clients across multiple nodes. Because data is maintained across the nodes in near real-time, peer-to-peer replication provides data redundancy, which increases the availability of data.
3. Considerations for configuring Peer to Peer replication in SQL server 2008.
Ans. General Considerations
· Peer-to-peer replication is available only in SQL Server 2008 Enterprise.
· All databases that participate in peer-to-peer replication should contain identical schema and data:
o Object names, object schema, and publication names should be identical.
o Publications must allow schema changes to be replicated. (This is a setting of 1 for the publication property replicate_ddl, which is the default setting. Row and column filtering are not supported.
· We recommend that each node use its own distribution database. This eliminates the potential of having a single point of failure.
· Tables and other objects cannot be included in multiple peer-to-peer publications in a single publication database.
· A publication must be enabled for peer-to-peer replication before any subscriptions are created.
· Subscriptions must be initialized by using a backup or with the 'replication support only' option. MS do not recommend the use of identity columns. When using identities, you must manually manage the ranges assigned to the tables at each participating database.
4. Development implications on Peer to Peer Environment
Ans.
Considerations for Schema Changes.
Schema changes are subject to any restrictions imposed by Transact-SQL. For example, ALTER TABLE does not allow you to ALTER primary key columns.
· If a publication is set to allow the propagation of schema changes, schema changes are propagated regardless of how the related schema option is set for an article in the publication. For example, if you select not to replicate foreign key constraints for a table article, but then issue an ALTER TABLE command that adds a foreign key to the table at the Publisher, the foreign key is added to the table at the Subscriber. To prevent this, disable the propagation of schema changes before issuing the ALTER TABLE command.
· If the schema change references objects or constraints existing on the Publisher but not on the Subscriber, the schema change will succeed on the Publisher but will fail on the Subscriber.
· All objects on the Subscriber that are referenced when adding a foreign key must have the same name and owner as the corresponding object on the Publisher.
· Explicitly adding, dropping, or altering indexes is not supported. Indexes created implicitly for constraints (such as a primary key constraint) are supported.
· Altering or dropping identity columns that are managed by replication is not supported.
· Schema changes that include nondeterministic functions are not supported because they can result in data at the Publisher and Subscriber being different (referred to as non-convergence). For example, if you issue the following command at the Publisher: ALTER TABLE SalesOrderDetail ADD OrderDate DATETIME DEFAULT GETDATE(), the values are different when the command is replicated to the Subscriber and executed.
· It is recommended that constraints be explicitly named. If a constraint is not explicitly named, SQL Server generates a name for the constraint, and these names will be different on the Publisher and each Subscriber. This can cause issues during the replication of schema changes. For example, if you drop a column at the Publisher and a dependent constraint is dropped, replication will attempt to drop the constraint at the Subscriber.
· To add a new column to a table and include that column in an existing publication, execute ALTER TABLE
ADD
. By default, the column is then replicated to all Subscribers. The column must allow NULL values or include a default constraint.
· To add a new column to a table and not include that column in an existing publication, disable the replication of schema changes, and then execute ALTER TABLE
ADD
.
· Adding an identity column to a published table is not supported, because it can result in non-convergence when the column is replicated to the Subscriber. The values in the identity column at the Publisher depend on the order in which the rows for the affected table are physically stored. The rows might be stored differently at the Subscriber; therefore the value for the identity column can be different for the same rows.
Dropping Columns
· To drop a column from an existing publication and drop the column from the table at the Publisher, execute ALTER TABLE
DROP
. By default, the column is then dropped from the table at all Subscribers.
· When dropping a column from a published article, take into consideration any constraints, indexes, or properties of the column that could affect the database. For example:
o You cannot drop columns used in a primary key from articles in transactional publications, because they are used by replication.
o Index changes are not propagated to Subscribers: if you drop a column at the Publisher and a dependent index is dropped, the index drop is not replicated. You should drop the index at the Subscriber before dropping the column at the Publisher, so that the column drop succeeds when it is replicated from the Publisher to the Subscriber. If synchronization fails because of an index at the Subscriber, manually drop the index and then rerun the Merge Agent.
o Constraints should be explicitly named to allow for dropping.