A: EEN is the short form for Engineered EndoNuclease. They are artificial endonucleases that are designed and constructed to target (i.e., bind and cleave) specific DNA sequences. EENs often contain protein domains for DNA recognition and DNA binding, and for DNA cleavage.
TALENs and ZFNs are the two most frequently used EENs. The DNA binding domains of these EENs are separated from the DNA cleavage domains / nuclease domains, and the binding domains consist of repeat units so that they are easy to engineer and construct.
In most places in EENdb, "EENs" refers to TALENs and ZFNs.
A: Yes, there are many other types of EEN; for example, engineered homing endonucleases (HEs) and zinc finger domains fused with restriction enzymes. Generally speaking, TALENs and ZFNs are much easier to engineer than most of the other EENs.
New types of EENs may be developed in the future.
A: EENdb is a database collecting information on all the reported TALENs and ZFNs. Related and other ZFP domains, and some TAL effectors besides TALENs, are also included. EENdb also collects other information about EEN engineering and other utilities, so as to provide a knowledge base of EENs.
Currently, EENdb does not collect other types of EENs. A database called LAHEDES has been developed for LAGLIDADG HE engineering by others.
A: We suggest referring to these FAQs: "1.5 Q: How do TALENs and ZFNs work?", "1.6 Q: What is the biggest difference between TALENs and ZFNs?", "2.1 Q: How can I know whether a gene has been targeted by TALENs/ZFNs reported by other researchers?" and all the FAQs listed under "Utilities & Engineering".
A: Usually, TALENs and ZFNs function as dimers (with a few exceptions: T/Z). Each EEN monomer consists of a DNA binding domain, i.e., the repeat units from TALE (Transcription Activator-Like Effector) for TALENs or the zinc finger arrays from ZFP (Zinc Finger Protein) for ZFNs, and a cleavage domain mainly derived from FokI. These two domains are connected by sequences such as the C-terminal region of the TALE framework or a short ZFP-FokI linker peptide. The DNA binding domains consist of repetitive sequences, so they are easy to engineer and construct (T/Z).
Based on current studies, the RVDs (Repeat Variable Di-residues) in the repeat units of TALEs and the 7-aa regions (positions -1 to +6 of the alpha-helices) of zinc fingers, the repeat units of ZFPs, are the key sequences for their DNA recognition and binding properties.
After the DNA binding domains recognize and bind to their target sequences in an appropriate configuration, the FokI cleavage domains dimerize and cut the DNA to generate double-strand breaks (DSBs). Various operations can then be applied to the target genome via the DSBs.
A: The biggest difference between TALENs/TALEs and ZFNs/ZFPs is their mechanism of target recognition. The repeat units in TALEs show a predictable one-repeat-one-nt relationship with their target nucleotides, while each zinc finger in ZFPs generally targets three nucleotides (a triplet), but no precise relationship between the two has been determined.
Thus, screening steps are required in most cases of ZFN/ZFP construction. Nevertheless, TALE proteins are much bigger than ZFPs, and special methods have to be developed for their construction.
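The one-repeat-one-nt relationship can be sketched as a simple lookup. This FAQ does not list the RVD code itself, so the mapping below (NI to A, HD to C, NG to T, NN to G) is an assumption taken from the commonly reported TALE code, and the function name is hypothetical. It is an illustration only, not EENdb's own implementation.

```python
# Assumed RVD-to-nucleotide code (commonly reported; not listed in this FAQ).
RVD_CODE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def rvds_to_target(rvds):
    """Predict the target sequence of a TALE repeat array, one nt per repeat.

    Raises KeyError for an RVD that is not in the assumed table.
    """
    return "".join(RVD_CODE[rvd] for rvd in rvds)


# One repeat corresponds to exactly one nucleotide of the predicted target.
example_rvds = ["NI", "HD", "NG", "NN", "HD"]
predicted = rvds_to_target(example_rvds)
print(predicted, len(predicted) == len(example_rvds))
```

No comparable table can be written for zinc fingers, which is why ZFP construction usually involves screening.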
A: There are four ways to search whether a particular gene has been targeted previously in the TALENs/ZFNs section of EENdb:
A: Please search in the reference search box. The PMIDs (PubMed IDs) and/or surnames of the first authors of the references are accepted. At most 10 IDs/names separated by spaces are allowed. Other text entered, such as "et al." and journal names, will be ignored.
For more complex conditions, such as searching by authors other than the first author or by journal name, we suggest using PubMed to get the PMIDs before searching in EENdb.
A: Yes. The search box for target sequences accepts degenerate nucleotides or nucleotide groups in brackets. For example, if the sequence "TCRCCC", "TC[AG]CCC" or "TC[GA]CCC" is searched, EENs whose target sequences include "TCACCC" or "TCGCCC" (5'-end to 3'-end) in either strand will be found.
List of degenerate nucleotides: R=[AG], Y=[CT], M=[AC], K=[GT], S=[CG], W=[AT], B=[CGT], D=[AGT], H=[ACT], V=[ACG], N=[ACGT].
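The matching rule described above can be sketched in a few lines of Python. This is an illustration of the idea only, not EENdb's actual search code; the function names are hypothetical, and the sketch assumes that degenerate letters are not used inside brackets.

```python
import re

# Degenerate nucleotide codes as listed in this FAQ.
DEGENERATE = {
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def query_to_regex(query):
    """Turn a query such as 'TCRCCC' or 'TC[AG]CCC' into a regex string."""
    out = []
    in_bracket = False
    for ch in query.upper():
        if ch == "[":
            in_bracket = True
            out.append(ch)
        elif ch == "]":
            in_bracket = False
            out.append(ch)
        elif ch in "ACGT":
            out.append(ch)
        elif ch in DEGENERATE and not in_bracket:
            out.append("[" + DEGENERATE[ch] + "]")
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return "".join(out)


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def matches_either_strand(query, target):
    """True if the query matches the target or its reverse complement."""
    pattern = re.compile(query_to_regex(query))
    target = target.upper()
    return bool(pattern.search(target)
                or pattern.search(reverse_complement(target)))


print(query_to_regex("TCRCCC"))
print(matches_either_strand("TC[GA]CCC", "AATCGCCCGG"))
```

Searching the reverse complement of the target is one simple way to cover both strands with a single pattern.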
A: No. Some of the information, e.g., efficiency and specificity, is omitted from the summary table. Some other fields are abbreviated to save space.
To access the omitted or more detailed information of an EEN record, click on the EENdb ID.
A: The EENdb IDs are composed of letters and numbers. An ID consists of the following parts:
Note: In detail pages, all EEN IDs with the same prefix and number are listed together for comparison. In summary tables, the EEN records are sorted by reference (PMID) by default, and can be re-sorted by ID.
A: A red background color means that the EEN was reported to have no or low activity.
Records with darker background colors represent EEN groups or EENs with off-target sites. See the FAQ "2.5 Q: Why some IDs of EENs end with additional letters and digits".
A: Probably. Some original publications provided the full sequences of the EEN (or ZFP domain, or TAL effector) constructs. Some did not provide them directly, but the full sequence can be deduced from the known information and/or by tracking down the history of the related literature. Unfortunately, in other cases the full sequences of reported constructs are not readily available. EENdb tries to recover the key information of the EENs/ZFP domains/TAL effectors, i.e., the RVDs or 7-aa regions of the DNA-binding domains, the names and lengths of frameworks and linker peptides, and the FokI variants.
Currently, we indicate the available full sequences for ZFNs recovered from the literature in the detail pages of EENdb. Special links in the pages of TALEN/ZFN and Utilities are available to list all these ZFN records. For the full sequences of other ZFNs, we suggest that users contact the corresponding authors/groups of the relevant publications to request the necessary information.
As for the TALENs/TAL effectors, most full sequences are available in the original references or can be deduced easily.
A: Uppercase nucleotides stand for the left or right half-sites; the sequences in the strands recognized by the TALE DNA-binding domains are underlined. Lowercase nucleotides stand for the spacer, or for the additional nucleotide flanking the 5' end of each half-site (position "0" or position "-1", generally a thymine).
Some references take the T at position 0 as part of the half-sites; EENdb has unified and standardized them.
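As a minimal sketch of how this upper/lowercase convention can be read programmatically, the following Python splits a target-site string into runs of the same case and reports the lengths of the left half-site, the spacer and the right half-site. The example sequence and the function names are hypothetical and are not taken from any EENdb record.

```python
from itertools import groupby


def split_by_case(site):
    """Split a target site into consecutive uppercase/lowercase runs."""
    return [("upper" if is_upper else "lower", "".join(chars))
            for is_upper, chars in groupby(site, key=str.isupper)]


def site_lengths(site):
    """Return (left half-site, spacer, right half-site) lengths.

    Assumes exactly two uppercase runs (the half-sites); the lowercase
    run between them is the spacer. Flanking lowercase nucleotides
    (e.g., the T at position 0) are not counted.
    """
    segments = split_by_case(site)
    upper_idx = [i for i, (kind, _) in enumerate(segments) if kind == "upper"]
    if len(upper_idx) != 2:
        raise ValueError("expected exactly two uppercase half-sites")
    left, right = upper_idx
    spacer = "".join(seg for _, seg in segments[left + 1:right])
    return len(segments[left][1]), len(spacer), len(segments[right][1])


# Hypothetical site: flanking t, left half-site, spacer, right half-site, flanking a.
example = "tGCTAGGAcctgaagTCCATGCa"
print(split_by_case(example))
print(site_lengths(example))
```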
A: To save space, we replace the spacer sequences between the two half-sites with ellipses (...) for some target sites in the summary table of a TALEN list. The full sequence can be found in the detail page by clicking on the ID of the TALEN. In the summary table, the numbers of nucleotides in the left half-site, the spacer (in brackets) and the right half-site are indicated under each sequence. The length of the spacer can be obtained from this information.
However, in the search result for target sequences, no sequences will be omitted in the summary table.
Unfortunately, some target sites are labeled as "(N/A)" both in the summary table and in the detail pages, since these sequences are not available from the original sources/publications.
A: It refers to the form of the EEN. Most TALENs function as dimers, and the FokI cleavage domain is usually fused to the C-terminus of the TALE DNA-binding domain in the monomer, so that the spacer lies at the 3'-end of the half-sites in the DNA target site. This type of TALEN is classified as the "T-D" (TALEN-Dimer) form. However, other forms of TALEN also exist; e.g., "T-D-N" means that the FokI domain is fused to the N-terminus of the TALE binding domain. See details in the Utilities page of TALEN forms.
If no form is indicated, the EEN belongs to the standard "T-D" form.
A: RVDs are the key sequences determining the DNA-binding properties of TALENs/TAL effectors. Each RVD shows a one-to-one relationship with its target nucleotide. Four RVDs are the most widely used in TALE/TALEN engineering and are called the "standard" RVDs, but other RVDs also exist.
"(std.)" means the RVDs used in this TALE binding domain are all "standard" ones. Some relatively common "non-standard" alternative RVDs and their related nucleotides are noted directly in different colors, such as "NK-G". "(others)" means other alternative RVDs.
A: The number in the TALEN framework refers to the number of amino acid residues in the C-terminal region of the TALE DNA-binding domain; the rest indicates the origin of the framework for the TALE DNA-binding domain.
Engineered TALENs use similar frameworks, but the lengths of the N- and C-terminal sequences of the binding domains vary. In addition, it was reported that the correlation between the length of the C-terminal region of the TALE binding domain and the length of the spacer in the target site also influences the efficiency of TALENs (e.g., 21179091).
Note: Currently, "+63" is the most commonly used framework in TALENs.
A: We list all the variants of the FokI cleavage domain in the Utilities page. "-D" and "-R" are the standardized names for the FokI variants carrying the single mutation R487D or D483R, respectively. Some publications use the names "-DD" and "-RR" for these two variants instead of "-D" and "-R".
Here are some other examples of aliases of FokI variants used in different publications: "-Sharkey-DAMQS" and "-PEAS" for "-Sharkey-AS", "-Sharkey-RR" and "-PERR" for "-Sharkey-R".
A: Key information on the FokI variants (FokI), construction methods (Construc.) and genome modification methods (Modif.) is listed in summary tables and detail pages. Clicking on the corresponding links leads to pages in the Utilities section that contain more relevant information.
In these Utilities pages, users can search for all EENs that use a particular FokI variant or construction method, or that were used with a particular genome modification method.
A: No. It only means that no information is available from the reference. For the Specificity field in particular, few experiments have been carried out and reported.
Reported off-target sites are listed in the detail page of the same EEN.
A: Uppercase nucleotides stand for the left or right half-sites; the sequences in the strands recognized and bound by the two ZFP DNA-binding domains are underlined. Lowercase nucleotides stand for the spacer, or for the additional nucleotides flanking the 3' end of each half-site. Approximately, one zinc finger (or one 7-aa region) corresponds to a 3-nucleotide target (i.e., a triplet). However, one or two additional nucleotides sometimes exist between two triplets; in this case, they are also shown in lowercase.
Some references did not distinguish the triplets from the additional nucleotides; EENdb has unified and standardized them.
A: This piece of information indicates the length and configuration of ZFN target sites.
Approximately, one zinc finger (or one 7-aa region) corresponds to a 3-nucleotide target (a triplet). These triplets are shown in uppercase letters. However, one or two more nucleotides sometimes exist between two triplets; they are shown in lowercase letters, as are the nucleotides of the spacer. "(2F+1+3F)" means the left half-site consists of 2 adjacent triplets and 3 adjacent triplets, separated by 1 nucleotide. Therefore, the number of nucleotides in the left half-site is 3*2+1+3*3=16. "" represents the length (number of nucleotides) of the spacer, and "4F" means the right half-site consists of 4 triplets.
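The arithmetic above can be sketched in Python as follows. The parser assumes that a half-site is written as terms joined by "+", where "nF" stands for n adjacent triplets and a bare number stands for extra nucleotides between triplets; the function name is hypothetical, and this is only an illustration of the calculation.

```python
import re


def half_site_length(config):
    """Compute the number of nucleotides in a half-site such as '(2F+1+3F)'."""
    total = 0
    for part in config.strip("()").split("+"):
        part = part.strip()
        fingers = re.fullmatch(r"(\d+)F", part)
        if fingers:
            # Each zinc finger targets one triplet (3 nucleotides).
            total += 3 * int(fingers.group(1))
        elif part.isdigit():
            # Extra nucleotides lying between two groups of triplets.
            total += int(part)
        else:
            raise ValueError(f"unrecognized term: {part!r}")
    return total


print(half_site_length("(2F+1+3F)"))  # 3*2 + 1 + 3*3
print(half_site_length("4F"))         # 3*4
```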
A: This information indicates the form of the EEN. Most ZFNs function as dimers, and the FokI cleavage domain is usually fused to the C-terminus of the ZFP DNA-binding domain in the monomer, so that the spacer lies at the 5'-end of the half-sites in the DNA target site (ZFP DNA-binding domains bind their target sequences in the opposite direction, i.e., N- to C-terminus of the protein corresponds to 3'- to 5'-end of the DNA sequence). This type of ZFN is classified as the "Z-D" (ZFN-Dimer) form. However, other forms of ZFN also exist; e.g., "Z-S2" means a "Sandwich" form of ZFNs. See details in the Utilities page of ZFN forms.
If no form is indicated, the EEN belongs to the standard "Z-D" form.
A: They represent the DNA-binding domains of the ZFN. They are linked to relevant records in the section of ZFP Domain and more details can be found there.
A: The linker peptide of a ZFN is defined as the amino acid residues between the ZFP DNA-binding domain and the FokI cleavage domain, which it therefore links.
It begins at the aa immediately after the histidine (H) residue of the last zinc finger and ends before the first glutamine (Q) residue of the FokI cleavage domain. Most of the linker peptides used in ZFNs are 4 or 17 aa long.
A: The section of ZFP Domain collects the information of zinc fingers, especially the 7-aa regions, and their target sites extracted from all the reported ZFP DNA-binding domains of various types showing various activities, including ZFNs and other types of zinc finger-related proteins (e.g., transcriptional activators, repressors and methylases), or artificially constructed ZFPs only tested by binding assays.
The purpose of developing such a dataset is to facilitate the screening and construction of new functional ZFPs/ZFNs.
A: The section of TAL Effector collects the information of the TALE DNA-binding domain (mainly the tandem-repeat region of TALEs) from TALE proteins other than TALENs.
The purpose of this dataset differs from that of ZFP Domain. The information on ZFP domains is interrelated with that on ZFNs, and the two are linked to each other, since ZFP DNA-binding domains can in some cases be reused and reassembled into various ZFNs. However, the information on TAL effectors and on TALENs is independent, because TAL DNA-binding domains are generally assembled for a single use.
A: One zinc finger in a ZFP targets approximately 3 nucleotides (a triplet), but it is also reported to interact with one additional nucleotide flanking the 3'-end of the triplet.
In the non-strict searching mode for ZFP target sequences, users input sequences comprising 3n nt (n is an integer), and ZFPs targeting continuous triplets of these nucleotides will be found.
In the strict searching mode, users input sequences comprising 3n+1 nt (n is an integer), and ZFPs targeting continuous triplets of the first 3n nucleotides plus the last queried nucleotide as the additional nt will be found.
A: See the answer to the same question in the FAQ section for EENs (both TALENs & ZFNs).
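The two input formats can be sketched as a small validation step in Python. This only illustrates the 3n versus 3n+1 rule described above; the function name and error messages are hypothetical, and how EENdb then matches the triplets against its records is not shown.

```python
def split_zfp_query(query, strict):
    """Split a ZFP target query into triplets and an optional extra nt.

    Non-strict mode expects 3n nucleotides; strict mode expects 3n+1,
    where the last nucleotide is the additional one after the last triplet.
    """
    query = query.upper()
    if strict:
        if len(query) < 4 or len(query) % 3 != 1:
            raise ValueError("strict mode expects 3n+1 nucleotides")
        core, extra = query[:-1], query[-1]
    else:
        if len(query) == 0 or len(query) % 3 != 0:
            raise ValueError("non-strict mode expects 3n nucleotides")
        core, extra = query, None
    triplets = [core[i:i + 3] for i in range(0, len(core), 3)]
    return triplets, extra


print(split_zfp_query("GCGTGGGCG", strict=False))
print(split_zfp_query("GCGTGGA", strict=True))
```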
A: Both yes and no. EENdb provides methods and resources for constructing new EENs. But at present there are no rules to predict whether newly constructed EENs could work efficiently or not. We hope the collection of the information of all the reported EENs may offer some clues concerning this issue.
For ZFNs, a screening or pre-testing step for the binding activity of ZFP domains is usually required, and the ZFN cleavage activity needs to be determined by experiments as well. As for TALENs, since the one-to-one relationship between the RVDs and their target nucleotides is known, no screening step is necessary; however, the efficiency of a given pair of TALENs still cannot be predicted and has to be determined by experiments.
A: For general applications, TALENs are recommended, because they do not need screening steps before construction and have shown satisfactory success rates, both as reported by others and in our own experience. Many construction methods for customized TALENs have been reported, and users can choose the one that best fits their purpose.
ZFN technology was developed much earlier than TALEN technology, and in certain applications it is more mature; e.g., ZFNs are being tested in gene therapy trials, and were recently reported to be deliverable directly into ex vivo cultured cells in the form of purified proteins (Ref: 22751204). Some ZFNs have been carefully studied for their specificity and potential toxicity, while the toxicity and immunogenicity of TALENs still need further characterization.
A: Most of the newly reported EENs use similar frameworks/linker peptides, i.e., Zif268 for ZFNs and the +63 framework for TALENs. Users can follow the corresponding protocols.
For most users, we suggest using the "standard" RVDs in the repeat units of TALENs. The repeat units of ZFNs, i.e., the 7-aa regions of each finger, must be screened or tested using appropriate construction strategies. Users should follow the details of each corresponding strategy.
For other factors of EEN engineering, see the FAQs below.
A: EENdb lists all the FokI variants that have been used in EEN engineering in the section of Utilities.
The variants were generated by introducing mutations into the wild-type FokI, for the purpose of improvement in three main aspects:
Users can choose the FokI variant that best fits their purpose. The most frequently used variants are probably those that function as obligate heterodimers, although TALENs may sometimes already give satisfactory specificity with the wild-type (WT) FokI, since the target sequence of TALENs is usually much longer than that of ZFNs.
A: EENdb lists reported screening and/or construction strategies for ZFNs and construction methods for TALENs in the section of Utilities, together with relevant engineering resources and comments on some of the advantages and disadvantages of each method. However, we suggest that users read the original references, protocols, and resources for further comparison.
A: EENdb lists reported genome modification methods in the section of Utilities.
If the goal is simply to mutate/disrupt a target gene in an organism or in cultured cells, NHEJ-induced small indels might be enough. Usually, frame-shift mutations caused by indels can easily be identified. In this case, to disrupt the function of the protein product as much as possible, the EEN target site should normally be selected in the early exons of the target gene. However, alternative splicing and alternative start codons may sometimes compromise the effect of the mutation. Distal deletions might be a better solution to these problems.
HR/HDR leads to predictable and precise modifications of the genome, but a donor template is required in addition to the EENs, and the HR efficiency is usually much lower than that of indel mutations. Genome modification through HR has only been reported/available in a limited number of species.
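Whether an indel causes a frame-shift follows from simple arithmetic: the reading frame shifts when the net change in length is not a multiple of 3. A minimal sketch, assuming the indel lies entirely within the coding sequence (the function name is hypothetical):

```python
def is_frameshift(inserted, deleted):
    """True if an indel shifts the reading frame.

    `inserted` and `deleted` are numbers of nucleotides; the frame is
    shifted when the net length change is not divisible by 3.
    """
    return (inserted - deleted) % 3 != 0


print(is_frameshift(0, 7))  # 7-bp deletion
print(is_frameshift(0, 6))  # 6-bp deletion
print(is_frameshift(2, 5))  # 5 bp replaced by 2 bp (net -3)
```

In-frame indels can still disrupt protein function, so this check only answers the frame question.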
A: The efficiency detection methods are listed and classified in the section of Utilities.
A: Some resources provide methods to predict candidate off-target sites. However, there is no common and systematic method to quantify or determine the difference between an off-target site and the target site.
A: Please cite EENdb as:
Xiao A., Wu Y., Yang Z., Hu Y., Wang W., Zhang Y., Kong L., Gao G., Zhu Z., Lin S. and Zhang B. (2013) EENdb: a database and knowledge base of ZFNs and TALENs for endonuclease engineering. Nucl. Acids Res. 41(D1): D415-D422.
Other citation formats (e.g., for EndNote or other reference managers) and a PDF version of the article can be downloaded from the journal website.
A: Yes, the downloadable files are listed in the section of Help. The datasets of TALENs, ZFNs, ZFP domains and TAL effectors are in TSV (tab-separated values) format, which can be opened in or copy-pasted into spreadsheet software such as MS Excel, or analyzed with other programs. A list of the references of EENdb in XLSX (MS Excel 2007-2010) format is also available.
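As a minimal sketch of analyzing a TSV file with another program, the Python standard library can read it directly. The column names and values below are hypothetical placeholders and are not the actual headers of the EENdb downloads; check the first line of the downloaded file for the real ones.

```python
import csv
import io


def read_tsv(handle):
    """Read a tab-separated file into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Hypothetical in-memory sample standing in for a downloaded file.
sample = io.StringIO("id\ttype\tgene\nX1\tTALEN\tgeneA\nX2\tZFN\tgeneB\n")
rows = read_tsv(sample)
talens = [row for row in rows if row["type"] == "TALEN"]
print(len(rows), talens)

# For a real download (hypothetical file name):
# with open("eendb_talen.tsv", newline="", encoding="utf-8") as fh:
#     rows = read_tsv(fh)
```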
A: We are very glad to receive your corrections, additional information and any other advice. Please leave a comment at the end of the detail page if it is related to a specific record, or contact us by e-mail at firstname.lastname@example.org.
A: If you want to leave comments on EENdb, you can use the user comment system at the end of each detail page if the comment is related to a specific record, or at the end of this FAQ page if it is a general comment not related to any specific record.
Your nickname and e-mail address are required. Making the e-mail address public to other users is optional. If you choose to make it public, it will be shown in a modified form (such as zfgenetics AT gmail DOT com) to protect you from spam.
To meet legal, governmental and/or server supplier's requirements, and to prevent spam, your comment might need to be reviewed by the administrator before it is made public. Please be patient and do not submit the same content again. Thank you very much. Old and archived comments might be deleted.
Alternatively, you can send your comments to us by e-mail at email@example.com.
A: You are welcome to leave your questions at the end of this FAQ page or contact us by e-mail at firstname.lastname@example.org.