
Search Results


  • Ph.D. program at the Department of Biomedical Sciences, UNIL - University of Lausanne.

    Are you looking for a Ph.D. program that will challenge you and prepare you for a rewarding career in the life sciences? Do you have a passion for stem cell and regenerative medicine? If so, you might be interested in the open position at the Habib Lab, Department of Biomedical Sciences, UNIL - University of Lausanne. The Habib Lab is a leading research group that uses tissue engineering to study and manipulate embryogenesis and adult tissue formation, and applies this knowledge to develop novel technologies that promote tissue repair after injury; for example, the lab has created a bandage that promotes bone repair. As a Ph.D. student in the Habib Lab, you will work on a project that explores the nutritional, metabolic, and epigenetic requirements for the formation of engineered osteogenic tissue in bone. You will use cutting-edge tools such as RNA-Seq, gene editing, and advanced imaging techniques to understand the specific pathways that regulate tissue formation, and you will design new approaches to target the metabolic pathways of the bone stem cell niche for tissue repair. To apply for this position, you should have an MSc degree in the biological/bioengineering sciences or a related discipline, be highly motivated, and have an interest in stem cell biology and bioengineering. You should submit your CV, publication list, and a covering letter that summarizes your previous research experience and explains how you meet the criteria. The expected start date of the Ph.D. program is 01.12.2023. This is a rare opportunity to join a world-class research team and pursue a Ph.D. in stem cell and regenerative medicine: you will learn from experts, use state-of-the-art techniques, and contribute to the advancement of life science. Don't miss this chance to apply for the open position at the Habib Lab. Click on the link below to submit your application. Good luck!

  • ICSURE-2024: An International Conference on Frontiers of Sustainable Research.

    Are you interested in sustainable research? Do you want to learn from experts, network with peers, and present your work on a global platform? If so, you should not miss the International Conference on Frontiers of Sustainable Research in Health Care, Food, Agriculture, and Environmental Management (ICSURE-2024). ICSURE-2024 is a two-day conference organized by the Guru Nanak Centre for Research (GNCR), Guru Nanak College (Autonomous), Chennai, in association with the G.S. Gill Research Institute. The conference aims to bring together researchers, practitioners, policymakers, and students from various disciplines and sectors to share their knowledge, experience, and insights on sustainable research. The conference covers a wide range of topics, such as:
    - Sustainable research in health care
    - Sustainable food production and consumption
    - Sustainable agriculture and rural development
    - Sustainable environmental management and conservation
    - Food, agriculture, and environmental management
    - Sustainable development goals and indicators
    The conference will feature keynote speeches, plenary sessions, panel discussions, oral presentations, poster presentations, and workshops, and will also provide opportunities for networking, collaboration, and publication. It will be held in hybrid mode, meaning you can participate either online or offline. The online mode uses a web-based platform through which you can access the conference sessions, interact with the speakers and attendees, and submit your feedback; the offline mode uses a physical venue that hosts the conference sessions and provides the necessary facilities. The conference is open to anyone interested in sustainable research, regardless of academic background, professional experience, or geographical location. It is especially suitable for:
    - Researchers and scholars who want to present their work and get feedback from peers and experts
    - Practitioners and policymakers who want to learn from best practices and case studies and apply them in their own contexts
    - Students and educators who want to enhance their knowledge and skills and explore career opportunities in sustainable research
    - Industry and civil society representatives who want to showcase their products and services and collaborate with potential partners
    If you are one of them, register for the conference as soon as possible. The registration fee is very affordable and includes access to all conference sessions, materials, and certificates:
    - Students: participants Rs. 500/-, presentation (oral/poster) Rs. 1,000/-
    - Scholars: participants Rs. 750/-, presentation (oral/poster) Rs. 1,500/-
    - Faculty members: participants and presentation (oral/poster) Rs. 1,500/-
    - Industry: participants Rs. 2,000/-, presentation (oral/poster) Rs. 3,000/-
    - International: participants USD 100, presentation (oral/poster) USD 150
    You can pay the registration fee through NEFT transfer or G PAY to the following account: Beneficiary account name: GNES CENTRE FOR CONSULTANCY AND OUTREACH; A/C No: 100047544447; Bank name: EQUITAS SMALL FINANCE BANK LTD.; Branch: Velachery, Chennai; IFSC: ESFB0001004. To register for the conference, fill out an online or offline form and submit your abstract or full paper if you want to present your work. The deadlines are as follows:
    - Abstract submission: 01.12.2023
    - Full paper submission: 31.01.2024
    - Registration closes: 15.02.2024
    You can find the registration forms here: Online: [https://forms.gle/5rmBqeuacrJ2aA9x6] Offline: [https://forms.gle/NnUbMhm4VexkTuvH6]

  • Junior Research Fellow Position at University of Madras

    Project Location: You'll be based at the Department of Genetics, University of Madras.
    Principal Investigator: The project is led by Dr. B Anandan, Assistant Professor in the Department of Genetics.
    Funding: This research project is funded by DST-SERB CRG.
    Duration: Initially one year, with the possibility of extension for up to three years based on performance.
    Project Title: Characterizing the neuroprotective effects of Taraxerol and Taraxasterol against neurotoxin-induced cellular and animal models of Parkinson's disease.
    Qualifications: An MSc in Biomedical Genetics or a related life science branch, such as Biochemistry, Biotechnology, or Molecular Biology.
    Desirable Qualifications: Research experience in techniques including tissue culture, nucleic acid extraction, real-time PCR, ELISA, western blotting, zebrafish handling, and data analysis is highly desirable; experience in methods, tools, and techniques related to genetics research is also a plus.
    Financial Assistance: Qualified candidates who have passed national-level exams are eligible for a monthly stipend of ₹31,000 plus a 24% House Rent Allowance (HRA).
    Application Process: Complete the application form and send it to anand_gem@yahoo.com.
    Deadline: Apply by 3rd November 2023. Don't miss this opportunity to contribute to cutting-edge genetics research at the University of Madras.

  • Running a Windows 10 (x64) virtual machine inside Ubuntu 23.10

    In this article, we discuss how to install a Windows 10 virtual machine inside Ubuntu; that is, how to run a guest Windows operating system inside a Linux host. The software we will use is VirtualBox, which is freely available and can be downloaded from the Oracle VM VirtualBox page. VirtualBox is virtualization software that divides the computing resources between the two operating systems. It is a powerful package that currently runs on Windows, Linux, and Solaris hosts and supports a vast array of guest operating systems, and it is maintained by Oracle, which ensures professional quality. Utility-wise, VirtualBox is resource-intensive: it requires at least a dual-core CPU and 8 GB of RAM, because these resources are split between the two operating systems.
    Install VirtualBox
    Let's install VirtualBox. After visiting the software page, click the download button; a window opens asking you to choose the host operating system. Under the VirtualBox platform packages, choose the Linux distributions and download the appropriate package. Once VirtualBox has been downloaded, go to the folder containing the installation file. You may notice that it is a .deb file. To install a .deb file, you need the GDebi tool, which installs local deb packages. GDebi does not come with Ubuntu by default and can be installed from the terminal:
        sudo apt install gdebi -y
    Once the GDebi package installer is available, open the .deb file with the GDebi GUI. When the installation finishes, the Oracle VM VirtualBox Manager window appears.
    Installing the VM VirtualBox Extension Pack
    The VirtualBox Extension Pack can be installed to extend the functionality of the Oracle VM VirtualBox base package; for example, it lets the virtual machine use USB 2.0 and/or USB 3.0 devices, which is very useful. The Extension Pack you download should be the same version as VirtualBox itself, and it can be downloaded from the same downloads page. Once it has been downloaded, click Install and choose the downloaded Extension Pack file. When the installation completes, the installed package is listed with a green tick beside it.
    Checking whether virtualization is enabled in the BIOS
    It is important to check whether your computer's CPU supports virtualization, and this can be verified directly in the BIOS settings. On older machines, you can enable the virtualization feature under the advanced BIOS features if it is disabled; in some other BIOS versions, go to the advanced features and set Intel Virtualization Technology to enabled. On Linux, this can also be checked on the command line. First, install cpu-checker:
        sudo apt-get update
        sudo apt-get install cpu-checker
    Once it is installed, run:
        kvm-ok
    If it returns the following message, virtualization is supported:
        INFO: /dev/kvm exists
        KVM acceleration can be used
    Otherwise, if the message is as follows, virtualization has been disabled in the BIOS:
        INFO: /dev/kvm does not exist
        HINT: sudo modprobe kvm_intel
        INFO: Your CPU supports KVM extensions
        INFO: KVM (vmx) is disabled by your BIOS
        HINT: Enter your BIOS setup and enable Virtualization Technology (VT), and then hard poweroff/poweron your system
        KVM acceleration can NOT be used
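    If you prefer to script this check, here is a minimal Python sketch of the same idea, assuming a Linux host: it simply tests for the /dev/kvm device that kvm-ok reports on.
        import os

        # /dev/kvm is present when KVM acceleration is available (see the kvm-ok output above);
        # note it may also be absent if the kvm module is simply not loaded yet
        if os.path.exists("/dev/kvm"):
            print("KVM acceleration can be used")
        else:
            print("KVM not available - check that virtualization is enabled in the BIOS")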
    Installing the guest operating system
    Now we need to install the guest Windows operating system, which can be downloaded as an ISO file. Once the ISO file for the guest operating system is downloaded, it is time to create a new virtual machine. Click the "New" option under the "Machine" tab. In the Create Virtual Machine box, enter a name; if you are creating a virtual machine for Windows 10, name it Windows 10. Then select the ISO file for the guest operating system, in this case Windows 10. Once all the information is entered, click the "Next" button. At this step, it asks whether Windows should be installed unattendedly. Note that if the guest Windows is installed unattendedly, your user won't be added to the administrators group.
    Modifying the virtual machine's hardware
    Next, we modify the virtual machine's hardware: you can adjust the allocation of both base memory and processors. Then we can add a virtual hard disk to the new machine; you can create a new virtual hard disk, use an existing one, or even create a virtual machine without a virtual hard disk. In the Create Virtual Machine box, a summary of the virtual machine appears, showing the name of the guest OS, the path to the virtual machine, the base memory and processor allocation, and the size allocated to the disk. Here we are! The virtual machine has been created. Double-click the virtual machine, and you will see the installation of the guest OS begin. Once the guest operating system is installed, you will see "Windows - Oracle VM VirtualBox". You can toggle between fullscreen and windowed view by pressing the Host+F key combination (Right Ctrl+F by default) or by selecting the Fullscreen option under the View tab.
    Clipboard and folder sharing
    Let's see how folder sharing is performed. Before that, shut down the Windows virtual machine exactly as you would shut down a normally installed Windows. Once it is shut down, you will notice a "Powered Off" label on the Windows VM in VirtualBox. Now click the Settings button to open the Settings window. Clipboard sharing can be set to bidirectional so that copied text transfers in both directions: once the Windows 10 settings window opens, under the General tab, click the Advanced tab and set the Shared Clipboard option to Bidirectional (the other options are Host to Guest, Guest to Host, and Disabled). Now let's share folders. Go to Shared Folders; in this window you will see a small blue folder icon with a green plus. Click it, and a small window titled "Add Share" opens, where you can choose the path to the folder to be shared. Once the folder path has been selected, tick the Auto Mount option and click OK. Back in the VirtualBox Manager window, click Insert Guest Additions CD Image; this creates a virtual optical disk. In the guest's file explorer, open the optical disk drive, then select and execute the VBoxWindowsAdditions-amd64.exe file. Once it completes, restart the machine to enable access to the shared folder.

  • Job Opportunity: Project Assistant Position at the University of Madras

    Location of Employment: Center of Advanced Study in Crystallography and Biophysics, University of Madras.
    Number of Posts: Two positions are open.
    Scheme: National Network Project (NNP).
    Project Title: "Multi-targeted Lead Identification for Metabolic Disorders (Diabesity) through in-silico and biochemical characterization".
    Qualifications Required: An M.Sc. in Biophysics, Biochemistry, Biotechnology, Microbiology, or Molecular Biology, or an M.Tech in Biotechnology/Genetic Engineering. Experience in cloning, recombinant protein expression, and purification is highly desirable.
    Project Duration: Three years.
    Emoluments: A monthly salary of Rs. 24,800, comprising Rs. 20,000 basic pay and 24% House Rent Allowance (HRA).
    How to Apply: Email your biodata, degree certificates, work experience certificates, and any other relevant documents (if applicable) to nnpdbtunom@gmail.com.
    Application Deadline: 07.11.2023. Seize this opportunity to contribute to an exciting research project and advance your career. Apply now!

  • Academic Position: Junior Research Fellow at the Dept. of Pharmacy, BITS Pilani, Hyderabad

    Purpose of the JRF Posting
    The Junior Research Fellow (JRF) position is established to contribute to an ICMR (Indian Council of Medical Research) project. The primary objective is to conduct research and assist in the project's activities related to the pharmacological evaluation and characterization of a novel SARM1 inhibitor in a pre-clinical model of Charcot-Marie-Tooth 2A (CMT2A). The JRF will work under the guidance of the project's supervisors, Dr. Abhijeet Joshi and Prof. Punna Rao Ravi.
    Number of Positions
    Two Junior Research Fellow (JRF) positions are open.
    Work Location
    Department of Pharmacy, BITS Pilani, Hyderabad Campus.
    Supervisors (also the addresses to which CVs should be sent)
    Dr. Abhijeet Joshi (for all Pharmacology candidates) - Email: abhijeet.j@hyderabad.bits-pilani.ac.in
    Prof. Punna Rao Ravi (for all Pharmaceutics candidates) - Email: rpunnarao@hyderabad.bits-pilani.ac.in
    Title of the Project
    "Pharmacological evaluation and Characterization of a novel SARM1 inhibitor in a pre-clinical model of Charcot-Marie-Tooth 2A (CMT2A)."
    Fellowship Amount
    The selected JRFs will receive a monthly fellowship of ₹28,000 plus a House Rent Allowance (HRA) of ₹7,560, for a total monthly stipend of ₹35,560.
    Essential Eligibility for Position 1 (Pharmacology)
    Educational Qualification: M.Pharm (Pharmacology) or MSc (Life Science).
    Desirable Qualifications: A strong, self-motivated interest in neuroscience; experience in animal (mice) handling is a MUST; experience in basic techniques such as cell culture, RT-PCR, western blot, and basic biochemical assays.
    Essential Eligibility for Position 2 (Pharmaceutics)
    Educational Qualification: M.Pharm (Pharmaceutics).
    Desirable Qualifications: A strong interest in drug delivery systems (nose-to-brain delivery) and pharmacokinetics; experience in developing formulations such as polymeric or lipid nanoparticles; experience in developing analytical and bio-analytical methods using HPLC is a must; experience in handling lab animals for pharmacokinetic studies (dosing, drawing blood, etc.).
    How to Apply
    Interested candidates should send their CV to the respective supervisor based on their qualifications and research interests. The CV should provide comprehensive information on educational qualifications, research/work experience, and any published research papers (if applicable).
    Deadline
    The deadline for submitting applications is November 5, 2023. Interested candidates are encouraged to apply before this date to be considered for the JRF positions.

  • Welcome to the exciting opportunity to apply for MSc scholarships through the RUFORUM-GRA Graduate R

    The SMARTGRAZE project is a research initiative that aims to assess the potential of holistic planned grazing (HPG) as a climate change mitigation and adaptation strategy in the semi-arid rangelands of Kenya. The project is funded by the Regional Universities Forum for Capacity Building in Agriculture (RUFORUM) and the German Academic Exchange Service (DAAD), and is implemented by the University of Nairobi in collaboration with other partners. The project seeks to address the challenges of climate change and land degradation in semi-arid rangelands, which are characterized by low and erratic rainfall, high temperatures, and fragile ecosystems. These regions are home to millions of people who depend on livestock production for their livelihoods, but they are also vulnerable to the impacts of climate change, such as droughts, floods, and land degradation. The SMARTGRAZE project aims to assess the potential of HPG as a sustainable land management practice that can enhance soil organic carbon (SOC) stocks and fluxes, reduce greenhouse gas (GHG) emissions, and improve vegetation and soil properties in semi-arid rangelands. HPG is a grazing management system that involves the strategic movement of livestock across different grazing areas, based on the principles of ecosystem health, biodiversity, and productivity. The project will collect data on SOC stocks and fluxes, GHG emissions, vegetation, and soil properties under HPG and traditional grazing systems during wet and dry seasons. The data will be analyzed using various methods, including static chambers, GIS, and the Roth-C carbon model. The project will also assess the effects of HPG on livestock forage selectivity, nutritional quality, and productivity. The project is offering MSc scholarships to qualified applicants who are interested in conducting research on the effects of HPG on soil and vegetation properties, livestock forage selectivity, and nutritional quality in the rangelands of Kajiado County, Kenya. The scholarships cover tuition fees, monthly stipends, and research costs for two years. The findings of the SMARTGRAZE project will provide valuable insights into the potential of HPG as a sustainable land management practice in semi-arid rangelands. The project aims to contribute to the development of evidence-based policies and practices that promote climate change mitigation and adaptation in the region, and to enhance the capacity of local researchers and practitioners to conduct research and implement sustainable land management practices. Overall, the SMARTGRAZE project is an important initiative that seeks to address the challenges of climate change and land degradation in semi-arid rangelands.
    What are the requirements for this program?
    Here are the requirements, as outlined in the PDF file:
    1. An excellent BSc degree (1st class or upper 2nd class honors) in natural resource management, range management, agroecosystem and environment management, agricultural sciences, animal production, or a related field, from a recognized/accredited university.
    2. In-depth knowledge of arid and semi-arid rangeland ecosystems and extensive livestock production systems.
    3. Fluency in spoken and written English.
    4. Teamwork orientation and good communication skills.
    To apply for the MSc scholarships, interested applicants should choose the research topic of interest and send their application to the undersigned. The application should include a letter of motivation related to the chosen research topic (1 page), a curriculum vitae, a summary of the undergraduate special project/field attachment report topic (1 page), copies of relevant certificates and transcripts, and the names and contact details of 2 referees. The application should be submitted by email as a single PDF file.

  • Creating a Custom YAML Dumper and Representer Function in Python

    The complex data that PyYAML finds harder to serialize
    This is the complex data:
        data = [{0.627: -47.57142857142857, 0.66: -35.76190476190476, 0.6930000000000001: -40.61904761904761, 0.726: -50.33333333333332, 0.759: -61.66666666666664, 0.792: -71.38095238095235, 0.8250000000000001: -76.23809523809521, 0.8580000000000001: -72.99999999999997, 0.891: -73.19047619047616, 0.924: -72.90476190476188, 0.9570000000000001: -72.33333333333331, 0.99: -71.66666666666664, 1.0230000000000001: -71.09523809523807, 1.056: -70.8095238095238, 1.089: -70.99999999999999, 1.122: -71.47619047619047, 1.155: -70.76190476190474, 1.1880000000000002: -69.33333333333331, 1.221: -67.66666666666664, 1.254: -66.23809523809523, 1.2870000000000001: -65.5238095238095, 1.32: -66.47619047619045, 1.353: -65.76190476190474, 1.3860000000000001: -64.33333333333331, 1.419: -62.66666666666665, 1.452: -61.23809523809522, 1.485: -60.52380952380951, 1.518: -60.99999999999998, 1.5510000000000002: -61.95238095238093, 1.584: -60.5238095238095, 1.617: -57.66666666666665, 1.6500000000000001: -54.33333333333332, 1.683: -51.47619047619046, 1.7160000000000002: -50.04761904761903, 1.749: -50.99999999999999, 1.782: -52.14285714285714, 1.8150000000000002: -50.42857142857143, 1.848: -47.0, 1.881: -43.0},
                {5.577: -43.99999999999998, 5.61: -66.99999999999997, 5.643000000000001: -86.71428571428568, 5.676: -96.57142857142853, 5.7090000000000005: -89.99999999999997, 5.742: -89.99999999999997, 5.775: -89.99999999999997, 5.808: -89.99999999999997, 5.841: -89.99999999999997, 5.8740000000000006: -89.99999999999997, 5.907: -89.99999999999997, 5.94: -90.19047619047618, 5.973: -89.9047619047619, 6.006: -89.33333333333331, 6.039000000000001: -88.66666666666666, 6.072: -88.09523809523807, 6.105: -87.8095238095238, 6.138: -88.0, 6.171: -88.0, 6.204000000000001: -88.0, 6.237: -88.0, 6.2700000000000005: -88.0, 6.303: -88.0, 6.336: -88.0, 6.369000000000001: -88.0, 6.402: -88.0, 6.4350000000000005: -88.0, 6.468: -88.0, 6.501: -88.0, 6.534000000000001: -88.0, 6.567: -88.0, 6.6000000000000005: -88.19047619047618, 6.633: -87.9047619047619, 6.666: -87.33333333333331, 6.699000000000001: -86.66666666666666, 6.732: -86.09523809523807, 6.765000000000001: -85.8095238095238, 6.798: -86.0, 6.831: -87.8095238095238, 6.864000000000001: -85.09523809523809, 6.897: -79.66666666666666, 6.930000000000001: -73.33333333333331, 6.963: -67.9047619047619, 6.996: -65.19047619047618, 7.029: -66.99999999999999, 7.062: -74.8095238095238, 7.095000000000001: -63.09523809523809}]
    This is a list object containing a sequence of dictionary objects. We need to serialize this complex data by writing it into a YAML file. If you have gone through my previous tutorial on YAML, I have already provided the solution there:
        import yaml

        with open(r"dataSerialized.yaml", "w") as wFile:
            yaml.dump(data, wFile, default_flow_style=False)
    Here, default_flow_style can be set to True or False, depending on how you want the output arranged. If it is set to False, the data will be arranged in block style (nested, one key-value pair per line); if it is set to True, the output will be in flow style (inline and JSON-like).
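    To see the difference between the two styles concretely, here is a small sketch; the two-entry sample stands in for the full data above:
        import yaml

        sample = [{0.627: -47.57, 0.66: -35.76}]

        # Block style: each key-value pair on its own line
        print(yaml.dump(sample, default_flow_style=False))

        # Flow style: inline and JSON-like, e.g. [{0.627: -47.57, 0.66: -35.76}]
        print(yaml.dump(sample, default_flow_style=True))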
    Let's look at this complex data a bit closer.
        import yaml

        # listVar is the same list of two dictionaries shown above (abbreviated here)
        listVar = [{0.627: -47.57142857142857, ...}, {5.577: -43.99999999999998, ...}]

        # Let us check the data type
        print(f"\nThis is a {type(listVar)} data type")

        # Let us check what is inside the list
        dictItem = listVar[0]
        print(f"\nThis is a {type(listVar)} data type that contains {type(dictItem)} data type")

        # And the types of the keys and values
        numItem = list(dictItem.keys())[0]
        valueItem = list(dictItem.values())[0]
        print(f"""\nThis is a {type(listVar)} data type that contains {type(dictItem)} data type and the keys are {type(numItem)} data type""")
        print(f"""\nThis is a {type(listVar)} data type that contains {type(dictItem)} data type and the values are {type(valueItem)} data type""")
    It shows the following output.
        This is a <class 'list'> data type
        This is a <class 'list'> data type that contains <class 'dict'> data type
        This is a <class 'list'> data type that contains <class 'dict'> data type and the keys are <class 'float'> data type
        This is a <class 'list'> data type that contains <class 'dict'> data type and the values are <class 'float'> data type
    Now let's write the list (the complex data) to a YAML file.
        with open(r"D:\website\wixSite\articles\yaml\testYaml.yaml", "w") as wFile:
            yaml.dump(listVar, wFile)
    The output is as you see below.
        - 0.627: -47.57142857142857
          0.66: -35.76190476190476
          0.6930000000000001: -40.61904761904761
          0.726: -50.33333333333332
          0.759: -61.66666666666664
          0.792: -71.38095238095235
        ..............................
    What if the complex data contains data types that YAML can't support? Let's tweak the above complex data for this tutorial's purpose: for example, let's change the type of the keys and values from float to np.float64. We have written the following code.
        import yaml
        import numpy as np

        # The same list of two dictionaries shown above (abbreviated here)
        listVar = [{0.627: -47.57142857142857, ...}, {5.577: -43.99999999999998, ...}]

        # Let us check the data types before conversion
        print(f"\nThis is a {type(listVar)} data type")
        dictItem = listVar[0]
        print(f"\nThis is a {type(listVar)} data type that contains {type(dictItem)} data type")
        numItem = list(dictItem.keys())[0]
        valueItem = list(dictItem.values())[0]
        print(f"""\nThis is a {type(listVar)} data type that contains {type(dictItem)} data type and the keys are {type(numItem)} data type""")
        print(f"""\nThis is a {type(listVar)} data type that contains {type(dictItem)} data type and the values are {type(valueItem)} data type""")

        # Convert every key and value to np.float64
        listVarConverted = [{np.float64(i): np.float64(j) for i, j in item.items()} for item in listVar]
        print(listVarConverted)

        # Check the data types after conversion
        dictItem = listVarConverted[0]
        numItem = list(dictItem.keys())[0]
        valueItem = list(dictItem.values())[0]
        print(f"""\nThis is a {type(listVarConverted)} data type that contains {type(dictItem)} data type and the keys are {type(numItem)} data type""")
        print(f"""\nThis is a {type(listVarConverted)} data type that contains {type(dictItem)} data type and the values are {type(valueItem)} data type""")
    The output is as follows.
        This is a <class 'list'> data type
        This is a <class 'list'> data type that contains <class 'dict'> data type
        This is a <class 'list'> data type that contains <class 'dict'> data type and the keys are <class 'float'> data type
        This is a <class 'list'> data type that contains <class 'dict'> data type and the values are <class 'float'> data type
        ------------After conversion------------------------
        This is a <class 'list'> data type that contains <class 'dict'> data type and the keys are <class 'numpy.float64'> data type
        This is a <class 'list'> data type that contains <class 'dict'> data type and the values are <class 'numpy.float64'> data type
    Now, let's try to write this complex data into a YAML file.
        with open(r"D:\website\wixSite\articles\yaml\testYaml.yaml", "w") as wFile:
            yaml.dump(listVarConverted, wFile)
    And the output is as follows.
        - ? !!python/object/apply:numpy.core.multiarray.scalar
            - &id001 !!python/object/apply:numpy.dtype
              args:
              - f8
              - false
              - true
              state: !!python/tuple
              - 3
              - <
              - null
              - null
              - null
              - -1
              - -1
              - 0
            - !!binary |
              qvHSTWIQ5D8=
          : !!python/object/apply:numpy.core.multiarray.scalar
            - *id001
            - !!binary |
              kiRJkiTJR8A=
          ? !!python/object/apply:numpy.core.multiarray.scalar
            - *id001
            - !!binary |
              H4XrUbge5T8=
          : !!python/object/apply:numpy.core.multiarray.scalar
            - *id001
            - !!binary |
              GIZhGIbhQcA=
        ..............................
    Here, the output is quite different. The leading "-" shows that it is a list of items. The "?" in "? !!python/object/apply:numpy.core.multiarray.scalar" indicates a mapping (dictionary) whose keys are themselves structured nodes, the !!python/object/apply:numpy.core.multiarray.scalar tag shows that the data are of np.float64 type, and the keys and values are stored in binary format. Below is one such mapping.
        ? !!python/object/apply:numpy.core.multiarray.scalar
          - *id001
          - !!binary |
            lkOLbOf7G0A=
        : !!python/object/apply:numpy.core.multiarray.scalar
          - *id001
          - !!binary |
            wjAMwzBMUMA=
        ? !!python/object/apply:numpy.core.multiarray.scalar
          - *id001
          - !!binary |
            BFYOLbIdHEA=
        : !!python/object/apply:numpy.core.multiarray.scalar
          - *id001
          - !!binary |
            //////+/UMA=
    *id001 is a YAML alias: it refers back to a previously anchored node wherever the same data occurs again. The corresponding anchor, &id001, is defined at the beginning of the document:
        - &id001 !!python/object/apply:numpy.dtype
          args:
          - f8
          - false
          - true
    Here, f8 represents a floating-point number of 8 bytes (64 bits); this is not a normal Python float but numpy.float64. What the serialized YAML output shows, then, is a list of dictionaries containing numpy scalar data. So, how can we properly serialize this complex data when some items are of np.float64 type? We need to customize how the complex data is represented before serializing it.
    How to serialize complex data that cannot be serialized normally
    The first step is to create a custom function, which programmers conventionally call a "custom dumper function". The custom dumper function takes two parameters: the dumper object of the yaml module and the data to be serialized.
    Within the function, we need to manually convert the data object; in the case discussed above, the np.float64 items need to be converted to regular floats. Once the data points are converted, we recreate the corrected version of the complex data and return it using an appropriate method of the dumper object; for example, to return a corrected list, use the represent_list method of the dumper object. Later, using the add_representer method of yaml, we register this new YAML representation of the complex data for serialization. From then on, whenever yaml encounters similar complex data, the custom dumper function is invoked, and the YAML representation of the complex data is serialized as specified in the custom function. Let's look at the Python program step by step.
    First, create the custom representer function, with the dumper object and the data as parameters.
        def numpy_representer(dumper, data):
    Next, initialize an empty list to store the corrected version of the complex data.
        def numpy_representer(dumper, data):
            """Initialize a new empty list that will store the modified dictionaries."""
            new_data = []
    Now, iterate through the list and check whether each item is an instance of the dictionary type. If an item is a dictionary, initialize a new empty dictionary for its converted pairs.
        def numpy_representer(dumper, data):
            new_data = []
            # iterate through each item
            for item in data:
                # if the item is a dictionary
                if isinstance(item, dict):
                    new_item = {}
    Iterate through the keys and values of each dictionary and check whether they are instances of the np.float64 data type. If so, convert them to regular floats and add them to the new dictionary.
        def numpy_representer(dumper, data):
            new_data = []
            for item in data:
                if isinstance(item, dict):
                    new_item = {}
                    # get each key and value
                    for key, value in item.items():
                        # if the key or value is an np.float64, convert it to float
                        if isinstance(key, np.float64):
                            key = float(key)
                        if isinstance(value, np.float64):
                            value = float(value)
                        new_item[key] = value
    Once the keys and values of a dictionary have been converted, append the dictionary to the list; items that are not dictionaries are appended unchanged. Finally, return the converted list using the represent_list method of the dumper object.
        def numpy_representer(dumper, data):
            """Convert np.float64 keys/values to plain floats, then represent the list."""
            new_data = []
            # iterate through each item
            for item in data:
                # if the item is a dictionary
                if isinstance(item, dict):
                    new_item = {}
                    # get each key and value
                    for key, value in item.items():
                        # if the key or value is an np.float64, convert it to float
                        if isinstance(key, np.float64):
                            key = float(key)
                        if isinstance(value, np.float64):
                            value = float(value)
                        new_item[key] = value
                    # when the dictionary is done, append it to the list
                    new_data.append(new_item)
                else:
                    new_data.append(item)
            # return the represented (serialized) list
            return dumper.represent_list(new_data)
    Now, register the YAML representation of the complex data using the add_representer method of the yaml module. The add_representer method takes two parameters: the data type that should be considered for serialization, and the custom representer function that defines the YAML representation of the complex data.
        # Add the custom representer to PyYAML for lists:
        # whenever a list is dumped, the numpy_representer function is invoked
        yaml.add_representer(list, numpy_representer)
    Now let us dump the complex data using the dump method of the yaml module. Since the top-level object is a list, the type registered in add_representer(), it is routed through the custom representer function.
        # path points to the output directory (defined elsewhere in the original script)
        with open(f"{path}/Dictionary_Data/dictLowThreshCorrected/dictLowThreshCorrected.yaml", "w") as wFile:
            yaml.dump(listVarConverted, wFile, default_flow_style=False)
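    Putting it all together, here is a minimal, self-contained sketch of the whole approach; the two-entry sample stands in for the full data above:
        import yaml
        import numpy as np

        def numpy_representer(dumper, data):
            """Represent a list after converting np.float64 keys/values to floats."""
            new_data = []
            for item in data:
                if isinstance(item, dict):
                    new_data.append({
                        (float(k) if isinstance(k, np.float64) else k):
                        (float(v) if isinstance(v, np.float64) else v)
                        for k, v in item.items()
                    })
                else:
                    new_data.append(item)
            return dumper.represent_list(new_data)

        yaml.add_representer(list, numpy_representer)

        sample = [{np.float64(0.627): np.float64(-47.571)},
                  {np.float64(5.577): np.float64(-43.999)}]

        # Dumps as plain floats, with no !!python/object/apply tags
        print(yaml.dump(sample, default_flow_style=False))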

  • RepeatMasker with OmicsBox: an easy way to mask repetitive sequences

    Repetitive DNA sequences
    Repetitive DNA sequences are of two different types: low-complexity repeats and transposable elements.
    What are low-complexity repeats?
    Low-complexity repeats are homopolymeric runs of nucleotides; for example, a homopolymeric stretch of adenine (A), thymine (T), cytosine (C), or guanine (G). A homopolymeric run of adenine looks like this: AAAAAAAAA
    Transposable elements
    These are DNA sequences that change position within the genome, and they are also called jumping genes. Viral-derived sequences, Long Interspersed Nuclear Elements (LINEs), and Short Interspersed Nuclear Elements (SINEs) are the major transposable elements.
    How do repetitive DNA sequences affect genome annotation?
    Repetitive DNA sequences can pose problems when annotating protein-coding genes. If you don't mask the repetitive sequences, they can generate many spurious BLAST alignments: unmasked repeats align with similar but non-homologous DNA sequences, producing unreliable data. Gene annotation involves describing the locations of genes in a genome; if you leave the repetitive DNA sequences unmasked, the annotation process can falsely show evidence for a gene in a location where, in reality, there is no gene but a repetitive element. Transposable elements are repetitive sequences that can move within a genome, and they carry ORFs, the sequences in genes that code for proteins. Some transposable-element ORFs resemble host genes. Gene predictors are software tools that describe the location and structure of genes in a genome; if transposable elements are left unmasked, gene predictors can falsely report transposable-element ORFs as host genes.
    How to mask repetitive sequences in a genome
    Repetitive sequences in a genome can be masked using a program called RepeatMasker, which scans DNA sequences for transposable elements, satellites, and low-complexity repeats. There are two outputs. One is a comprehensive report annotating the repeats in the query sequence, covering the different types of repetitive DNA elements found in the input DNA. The second is a modified copy of the DNA in which the repetitive sequences are masked: the nucleotides of the repetitive elements are replaced with "N" or "X", or converted to their lowercase versions. RepeatMasker also uses Tandem Repeat Finder (TRF) to find tandem repeats.
    How does RepeatMasker get information about the repetitive elements?
    RepeatMasker comes with the Dfam database, which harbors information on transposable element families. As of Dfam version 3.7, it holds information on 3,437,876 transposable element families spanning 2,306 species. The Dfam website contains a collection of multiple sequence alignments, each containing multiple members of a transposable element family; by aligning these representative members, an HMM and a consensus sequence have been built for each family. OmicsBox also works with RepBase, which contains representative repetitive sequences from eukaryotic species.
    How to perform repeat masking using OmicsBox
    The repeat masking functionality is under Genome Analysis. Here, we select the input DNA sequences in FASTA format, and then we can set the analysis parameters.
    Setting the search configuration for RepeatMasker
    First of all, we need to select the search engine that will perform the search for repeats. There are two search engines: 1. RMBlast and 2. HMMER. RMBlast is an NCBI BLAST-compatible engine for RepeatMasker: it works with NCBI BLAST to find the repetitive sequences within the input DNA by searching the query sequence against a nucleotide database. RMBlast aligns the query DNA sequence against the repetitive sequences in databases like Dfam and calculates alignment scores measuring the similarity between the query sequence and each repetitive sequence. Based on the degree of similarity, RMBlast identifies the repetitive sequences and provides this information to RepeatMasker. The other search engine for homology search is HMMER. HMMER searches for homologs of the query against a database of sequences using profile Hidden Markov Models; this underlying probability model can detect remote homologs as sensitively as possible, and HMMER is now about as fast as BLAST. RepeatMasker uses the nhmmer program to search one or more nucleotide queries against a database and output the top matches.
    Selecting the repetitive sequence database
    RepeatMasker works with two databases at present: Dfam and RepBase. Dfam is a database of transposable elements; if it is selected as the database, it is not necessary to provide a database file. If you choose RepBase, however, a database file must be loaded into the wizard in EMBL format; the library should be the RepBase RepeatMasker edition, which can be downloaded from https://www.girinst.org/server/RepBase/. One can also use the custom option, in which you provide a custom library of repetitive sequences to be masked in the query sequence. The library file should contain repetitive sequences in FASTA format, with the FASTA IDs formatted as follows:
        >repeatname#class/subclass
    For example, let's see how to mask the repetitive elements under the subclass AluY. Alu elements are 300 bp long and are classified under the Short Interspersed Nuclear Elements (SINEs). So, when you create a custom library for an element belonging to AluY, the FASTA header should be of the format:
        >Alu#SINE/AluY
    Here, Alu is the name of the repetitive element, SINE is the name of the class to which the repetitive element Alu belongs, and AluY is the name of the subclass.
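    Since a malformed header will not be parsed into class and subclass correctly, a quick sanity check of a custom library can save a failed run. Here is a minimal Python sketch that flags IDs not matching the ">repeatname#class/subclass" convention; the file name custom_repeats.fasta is hypothetical:
        import re

        # Matches headers such as ">Alu#SINE/AluY"
        pattern = re.compile(r"^>[^#\s]+#[^/\s]+/\S+$")

        with open("custom_repeats.fasta") as fh:
            for lineno, line in enumerate(fh, start=1):
                if line.startswith(">") and not pattern.match(line.strip()):
                    print(f"Line {lineno}: badly formatted ID: {line.strip()}")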
    Let us look at another example. Imagine that we have done a genome assembly of a plant species, let's call it Chloropicon primus. The consensus sequence of a transposable element, downloaded from the plant repetitive element database PlantRep, is as follows:
        >rnd-2_family-71#DNA/MULE-MuDR ( Recon Family Size = 28, Final Multiple Alignment Size = 24 )
        GAGGATTGCANAAGAGGGGCGAAGTNCTNCGATCACCAGCACGTCGTCGA
        NTGGATCGAGGCNAATTCTTAACCAAGAACAATGTCTCACGACGAACCTG
        TCATCGACCTTCGCTTCCTCTTCCGATTCCTCCTCCTCCTCTCTCCTTTG
        CTCTTCCTCCTTCGCTCTTTCTCTTCCGGTGGATCTTCTCCCTCCCGCTG
        AACCATCAAAACGGCCCGAGGCCACGAGGGGCGAACGAGGACGAGGNGAG
        AGAGAAACGAAGAGGACGGGAGCGCGCGGAGGAGGGAGAGAGAGAGAGAG
        AAGNGGAGGACAGGAAAGAAGGAAGCGTCGTCTCNTGCTCTTTCGAACGA
        GCCCTCGCGCGAAAGAAACGACCCAGTGGCGAGGATCTGGCGACGCGAAA
        CGCGAAGAAGAGAGGCAGAAAGGAATCGAGGAGTAGATCACCGAGGAAGG
    Specifying the name of the species
    One can specify the name of the species; this name must be present in the NCBI Taxonomy database. If you use HMMER as the search engine, it uses the Dfam database to search for repetitive sequences; Dfam holds repetitive-sequence information for organisms such as human, mouse, zebrafish, fruit fly, and the nematode Caenorhabditis elegans.
    RMBlast options
    If RMBlast is used as the search engine, we need to choose a speed/sensitivity setting. There are three options:
    - Rush: the database search, alignment, repeat masking, and annotation run 4-10 times faster than the default option, but about 10% less sensitively.
    - Quick: the process is 5-10% less sensitive but 2-5 times faster than the default.
    - Slow: the process is 0-5% more sensitive but 2-3 times slower than the default.
    Divergence cut-off
    A divergence cut-off can be applied. When this option is turned on, RepeatMasker masks only the repetitive sequences that are less diverged from the consensus than the value we provide as the cut-off. Repeat databases such as RepBase, Dfam, and PlantRep contain consensus repeat sequences; the repetitive elements in a real genome are likely to carry mutations and so diverge from the consensus. If you provide a divergence cut-off of, say, 20%, RepeatMasker masks a repetitive sequence only if it is less than 20% diverged from the consensus sequence in the database or library file. When running RepeatMasker on the command line, as in RepeatMasker [-options] <seqfiles>, the same behavior is available through the -div parameter, which takes the percentage divergence: repeats that have diverged less than the cut-off value will be masked.
    Output options
    There is a set of parameters that need to be given values. Masking option: here we specify how the repetitive sequences should be masked; bases of the repetitive sequences can be replaced with "N" or "X", or with the lowercase versions of the respective bases. Only Alu elements: if the query DNA is from a primate, this option can be used; Alu repeats are 300 bp-long repeats of the class SINE. Type of repeats: here we set the types of repeats that RepeatMasker should mask, for example interspersed repeats, simple repeats, and low-complexity repeats.
    Output
    The FASTA output file contains the masked sequence in FASTA format. The locations of the masked repeats are provided in GFF format.
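    As a quick way to gauge the result, the sketch below counts how many bases ended up masked as "N", "X", or lowercase; it assumes a hard- or soft-masked FASTA output file, and the name masked_genome.fasta is hypothetical:
        # Count masked bases in a RepeatMasker FASTA output
        # (note: "N" can also mark assembly gaps, which would inflate the count)
        total = masked = 0
        with open("masked_genome.fasta") as fh:
            for line in fh:
                if line.startswith(">"):
                    continue  # skip headers
                seq = line.strip()
                total += len(seq)
                masked += sum(1 for base in seq if base.islower() or base in "NX")
        print(f"Masked {masked:,} of {total:,} bases ({100 * masked / total:.1f}%)")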

  • Pornography Addiction and Dhat Syndrome: A Case Study

    About pornography addiction and Dhat syndrome: a case study of how pornography addiction can lead to compulsive masturbation and social withdrawal.

  • Will an LVM partition help me achieve increased storage space? What is the success rate?

    I was working on a genome assembly project when I was caught off guard by an issue: the assembly process was terminated due to insufficient space on the hard disk. I was assembling linked reads using Universal Sequencing Technology's TELLYSIS pipeline. After consulting the company, I was informed that I needed at least 1024 GB of free space before starting the assembly. I am using an IBM server running the Ubuntu distro, version 22.04. This server is a decade old but has recently undergone an upgrade, increasing its memory capacity to an impressive 248 GB. The machine is equipped with eight SAS hard disks, each with a capacity of 300 GB, which theoretically scales up to 8 x 300 = 2400 GB, or 2.4 TB, of storage capacity. However, although there are slots for 12 physical disks, four of the slots cannot accommodate additional disks due to a lack of adequate SAS connectivity on the motherboard. A quick and technically less demanding solution I considered was to replace one of the existing 300 GB SAS disks with a higher-capacity one. Upon further investigation and communication with local procurement services, however, I discovered that SAS disks compatible with my machine are quite rare on the market, and the available ones are rather expensive. As mentioned earlier, there are eight SAS disks of 300 GB each, so the total storage capacity should amount to 2400 GB; yet the system's usable storage capacity is only 1800 GB. It later became clear that the disks had been configured as two RAID 5 arrays of four disks each. When four hard disks are configured in RAID 5, pooled storage equivalent to three disks is available for use: the capacity of the fourth disk is consumed by parity information, which allows the array to survive a single disk failure. One set of four hard disks is configured as Drive 0, and the other set as Drive 1. This also means that when you write a file, it is not immediately clear on which of the four disks, or in whose clusters, the file physically resides. Drive 0 has two partitions, /dev/sda1 and /dev/sda2, and Drive 1 likewise has two, /dev/sdb1 and /dev/sdb2; all are created with an ext4 filesystem. While /dev/sda2 is a /boot/ partition, /dev/sdb2 is a general partition. The capacity of /dev/sda2 is 835 GB and that of /dev/sdb2 is 870 GB, and each has approximately 750 GB of free space, which is less than the free space required for running the genome assembly. To meet the requirement, I would need to install an additional hard disk with more than 1024 GB of capacity. Since these partitions belong to two distinct drives, resizing or merging them is not as straightforward as it is in the Windows OS. This is when I came across LVM (Logical Volume Management) technology. Using LVM, multiple physical disks can be merged to create a logical volume with the pooled storage capacity of the disks involved: physical volumes are created from the desired partitions, a volume group is formed from these physical volumes, and finally a logical volume of the desired size is created from the volume group. The storage capacity of the logical volume is thus backed by multiple physical disks. However, there is a risk involved: if any of the physical disks that are part of the volume group fails, the entire volume group fails, rendering the data irrecoverable.
    So, caution should be exercised, and a data backup is essential. However, I do have a backup, and I want to proceed with creating an LVM partition on the existing physical disks in my system. Here's a simplified protocol for the process:
    1. Boot the system from an Ubuntu bootable USB drive.
    2. Delete the partitions you want to use for creating physical volumes (in this case, /dev/sda2 and /dev/sdb2).
    3. Create new primary partitions and set their type to "LVM".
    4. Create a physical volume on each of these partitions.
    5. Create a volume group spanning /dev/sda2 and /dev/sdb2.
    6. Create a logical volume of the desired combined size, based on the storage capacity of /dev/sda2 and /dev/sdb2, within the volume group.
    7. Create an ext4 filesystem on the newly created logical volume.
    8. Mount the new logical volume partition.
    I have yet to implement these steps, and I have concerns about the success rate given the age of the SAS disks in the system. What do you think? How about installing Linux in a virtual machine using VirtualBox? Could that reduce the risk of data loss in case the logical volume somehow gets damaged? Share your thoughts!

  • Article: The Chromium de novo assembly solution.

    The de novo assembly solution
    The Chromium de novo solution helps generate a diploid assembly of a diploid genome, which means it generates an assembly of both alleles. 10x Linked-Reads are generated from a single DNA library using Chromium's de novo whole-genome library preparation process.
    Is 10x Genomics an easy way to get your de novo genome assembly done?
    Getting a genome assembled through an assembly process, especially for higher animals or for genomes containing many repeats, is not very easy. It is a long and arduous process; sometimes it may involve an additional upstream step such as inbreeding to obtain a good DNA sample, and once the reads are ready, a bioinformatician has to assemble them. In this context, we have 10x Genomics, whose laboratory and bioinformatics processes are turnkey and cookbook-style: they provide easy-to-follow instructions for DNA extraction, library preparation, and sequencing. Once the sequencing is completed, 10x Genomics' assembly software, Supernova 2.1.0, can be run to assemble the genome. Supernova 2.1.0 does not expect the user to be proficient in the fundamentals of computer programming; all that the software requires is a batch of all the FASTQ files associated with your library.
    Chromium de novo assembly preparation does not require much DNA sample: approximately 1 ng of DNA is required. This means you do not need to perform inbreeding to obtain clonally selected samples, and you avoid the extra heterozygosity and other complications associated with mixing wild samples. For library preparation, for organisms with a genome size below 1.6 Gb, the 10x Genomics library preparation protocol recommends a loading mass of 0.6 ng of DNA; if the genome size is in the range of 1.6-3.2 Gb, the loading mass should be interpolated between 0.6 and 1.2 ng; for genome sizes between 3.2 Gb and 4.0 Gb, 1.2 ng of DNA should be loaded. Genomes larger than 4.0 Gb are not recommended.
    Reduced cost of sequencing
    The Chromium de novo assembly solution costs much less than other sequencing technologies, especially long-read sequencing. Since it requires a very low quantity of DNA (at the nanogram level) and no upstream process is involved, the cost is low. Very importantly, the Chromium de novo library is sequenced using Illumina's low-cost platforms such as NovaSeq, HiSeq X, and HiSeq 2500, so the sequencing cost is not exorbitant. Beyond the monetary investment, 10x Genomics' Linked-Reads can be assembled using the Supernova 2.1.0 software package, which is not programmatically intense.
    How to make Supernova work for your genome of interest
    Supernova has been tested on a wide repertoire of organisms, and its scope of applicability has been characterized by testing on genomes that vary in a multitude of features. The smallest genome tested with Supernova is 140 Mb in size; the largest is 3.2 Gb. A genome larger than 4.0 Gb should be considered experimental, because it has not yet been tested with Supernova. When sequencing a genome, it is recommended to have genome coverage within the range of 38-56x. If you know the genome size, you can manually calculate the number of reads that corresponds to a recommended coverage (a worked calculation follows below). It is also recommended to have no more than 2.14 billion reads and, optimally, a read length of 150 bp.
    For instance, in my case, the total number of reads generated was 378 million, for a genome of 2.6 Gb as estimated by Supernova in a preliminary run. The TELL-Seq library was sequenced using Illumina's NovaSeq 6000 model, and the linked reads are 150 bp long. From this information, I was able to calculate the genome coverage, and it was around 21x. This is far lower than the recommended minimum coverage of 38x, so Supernova is not recommended for this assembly.
    Preparation of long, undamaged DNA is important for a good assembly. We need DNA from a single individual; DNA from a clonal population can also be used, though while isolating DNA from a clonal population, it is important that DNA from wild individuals does not get mixed in. An upstream process of inbreeding is not needed, because only a few nanograms of DNA are required. Normally, intact long DNA molecules can easily be prepared from sample types such as cell lines and blood. The long genomic DNA molecules resulting from the fragmentation process are used for the generation of barcoded linked reads: the long molecules are trapped in Gel EMulsion (GEM) beads, and the short reads generated from them are clonally barcoded. The length of the genomic DNA molecules is key here, as it dictates the quality of the assembly. If the genomic DNA molecules are shorter than 50 kb, that is problematic; shorter than 20 kb is highly problematic. For example, in our TELL-Seq library preparation, the genomic DNA molecules had an average length of 16 kb, with only around 3-4 reads per DNA molecule. This is seriously problematic and can result in a poor-quality assembly.
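    For reference, here is the back-of-envelope coverage calculation as a small Python sketch, using the numbers quoted above (coverage = read count x read length / genome size):
        # Numbers from this article
        read_count = 378e6     # total reads from the NovaSeq 6000 run
        read_length = 150      # bp per read
        genome_size = 2.6e9    # bp, Supernova's preliminary estimate

        coverage = read_count * read_length / genome_size
        print(f"Estimated coverage: {coverage:.1f}x")  # ~21.8x, well below the recommended 38x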
