C++: simulating a database and optimizing my code

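I have been working on a program that simulates a small database that can be queried. After writing the code I ran it, but the performance is quite poor: it runs very slowly. I have tried to improve it, but I only started with C++ a few months ago, so my knowledge is still limited. I would like to find a way to improve the performance.

Let me explain how my code works.

First, I have a .txt file that simulates a database table, containing random strings separated by "|". Here is an example of a table (with 5 rows and 5 columns):

Table.txt

My program reads the information in the .txt file and stores it in memory. Then, when I run a query, I access the information held in memory. Loading the data into memory may be slow, but accessing it afterwards is faster, and that is what really matters to me.

Here is the part of the code that reads this information from the file and stores it in memory.

Code that reads the data from the Table.txt file and stores it in memory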

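string ruta_base("C:\\a\\Table.txt"); // Folder where my "Table.txt" is found
string temp; // Variable where every row from the Table.txt file is first stored
vector<string> buffer; // Variable where the elements of each row are stored after splitting the row into tokens
vector<ElementSet> RowsCols; // I have created a class that simulates a vector; each element of the vector is a row of my table

ifstream ifs(ruta_base.c_str());

while (getline(ifs, temp)) // We read and store line by line until the end of the ".txt" file
{
    size_t tokenPosition = temp.find("|"); // The symbol "|" marks the boundary between elements, so we split the string temp into tokens that are stored in the vector buffer
    while (tokenPosition != string::npos)
    {
        string element;
        tokenPosition = temp.find("|");
        element = temp.substr(0, tokenPosition);
        buffer.push_back(element);
        temp.erase(0, tokenPosition + 1);
    }
    ElementSet ss(0, buffer);
    buffer.clear();
    RowsCols.push_back(ss); // We store all the elements of each row (held in the vector buffer) at a different position in "RowsCols"
}

vector<Table> tableDescriptor;
Table tableStorage(RowsCols);
tableDescriptor.push_back(tableStorage);
DataBase database(1, tableDescriptor);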
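
As a side note on the tokenizing loop above: each call to erase shifts the rest of the line, so splitting a long line does more work than necessary. The following is only a minimal sketch of an alternative that walks the line with a moving start position; the helper name splitFields is hypothetical and not part of the question's code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the question): splits `line` on `sep`
// without erasing from the front of the string after every token.
std::vector<std::string> splitFields(const std::string& line, char sep = '|') {
    std::vector<std::string> fields;
    std::size_t start = 0;  // index where the current field begins
    while (true) {
        std::size_t pos = line.find(sep, start);
        if (pos == std::string::npos) {
            fields.push_back(line.substr(start));  // last field: rest of the line
            break;
        }
        fields.push_back(line.substr(start, pos - start));
        start = pos + 1;  // skip the separator
    }
    return fields;
}
```

With this helper the loop body could become something like `ElementSet ss(0, splitFields(temp));`, assuming the ElementSet constructor shown in the class declarations.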

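Then comes the important part. Suppose I want to run a query and ask for input. Say the query starts at row "n", covers "numTuples" consecutive tuples, and the columns are given by "y". (The columns are defined by a decimal number "y" that is converted to binary and shows which columns to query; for example, if I ask for columns 54 (00110110 in binary), I am asking for columns 2, 3, 5 and 6.) I then access memory to fetch the requested information and store it in a vector shownVector. Here is part of that code.

Code that accesses the required information according to my input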
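
To make the column encoding concrete, here is a small sketch that decodes such a number into bit positions. The function name columnsFromMask is hypothetical. Positions are 0-based from the least significant bit, so 54 yields 1, 2, 4 and 5, which are the 2nd, 3rd, 5th and 6th columns of the example. Note that an unsigned long long has at least 64 bits (typically exactly 64), so a number passed this way cannot select more columns than that.

```cpp
#include <vector>

// Hypothetical helper: returns the 0-based positions of the bits set in `y`,
// starting from the least significant bit.
std::vector<int> columnsFromMask(unsigned long long y) {
    std::vector<int> columns;
    for (int bit = 0; y != 0; ++bit, y >>= 1) {
        if (y & 1ULL) {
            columns.push_back(bit);  // this bit is set, so the column is selected
        }
    }
    return columns;
}
```

The loop stops as soon as no set bits remain, so small values of y finish after only a few iterations.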

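int n, numTuples;
unsigned long long int y;
clock_t t1, t2;

cout<< "Write the ID of the row you want to get more information: " ;
cin>>n; // We get the row to be represented -> "n"

cout<< "Write the number of followed tuples to be queried: " ;
cin>>numTuples; // We get the number of followed tuples to be queried -> "numTuples"

cout<<"Write the ID of the 'columns' you want to get more information: ";
cin>>y; // We get the "columns" to be represented -> "y"

unsigned int r; // Auxiliary variable for the columns path
int t=0; // Auxiliary variable for the tuples path
int idTable;

vector<int> columnsToBeQueried; // Here we store the columns to be queried, obtained from the bitset<500> binaryNumber after comparing it with a mask
vector<string> shownVector; // Vector to store the final information from the query
bitset<500> mask;
mask=0x1;

t1=clock(); // Start of the query time

bitset<500> binaryNumber = Utilities().getDecToBin(y); // We get the columns -> change number from decimal to binary. Max number of columns: 5000

// We see which columns will be queried
for(r=0;r<binaryNumber.size();r++)
{
    if(binaryNumber.test(r) & mask.test(r))  // if both of them are bit "1"
    {
        columnsToBeQueried.push_back(r);
    }
    mask=mask<<1;
}

Whenever you have a performance problem, the first thing to do is profile your code. Free profiling tools are available for both Windows and Linux. Profile your code, identify the bottleneck, and then come back with a specific question.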
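Also, as I said in my comment, can't you just use an existing database engine? The one I suggested supports in-memory databases, which makes it suitable for testing, and it is lightweight and fast.

I have not worked through all of it, but you can analyze the complexity of your algorithm. Accessing an item of a vector takes constant time, but when you create loops, the complexity of your program increases: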
for (int i = 0; i < 1000; ++i)     // 1000 iterations
  for (int j = 0; j < 1000; ++j)   // 1000 iterations for each i
     myAction();                   // constant time in your case, but called 1000 * 1000 times

There is no need to reinvent the wheel; use the Firebird SQL embedded database instead. Combined with the IBPP C++ interface, it gives you a good foundation for future needs.

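One obvious issue is that your get functions return vectors by value. Do you need a fresh copy every time? Probably not.
If you return a const reference instead, you can avoid a lot of copies:
const vector<Table>& getPointer();

The same applies to the nested get functions.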
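
The difference can be made visible with a small experiment. This is only an illustrative sketch: Row and TableSketch are hypothetical stand-ins, not the classes from the question, and Row counts its own copies so the two getters can be compared.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type that counts how often it is copied.
struct Row {
    static int copies;
    std::string value;
    explicit Row(std::string v) : value(std::move(v)) {}
    Row(const Row& other) : value(other.value) { ++copies; }
    Row& operator=(const Row&) = default;
};
int Row::copies = 0;

// Hypothetical table with both getter styles side by side.
class TableSketch {
public:
    explicit TableSketch(std::vector<Row> rows) : rows_(std::move(rows)) {}

    // Returns a copy: every call copies every Row.
    std::vector<Row> rowsByValue() const { return rows_; }

    // Returns a reference to the stored vector: no Row is copied.
    const std::vector<Row>& rowsByRef() const { return rows_; }

private:
    std::vector<Row> rows_;
};
```

For a table holding three rows, one call to rowsByValue() increases Row::copies by three, while rowsByRef() leaves it unchanged. The getters are also marked const so they can be called through a const reference to the table.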
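Although I advise you to use a profiler to find out which parts of your code are worth optimizing, here is how I would write your program:
Read the entire text file into one string (or better, memory-map the file). Scan that string once to find all the | and \n (newline) characters. The result of this scan is an array of byte offsets into the string.
Then, when the user queries item M of row N, retrieve it with code something like this: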
char* begin = text+offset[N*items+M]+1; 
char* end = text+offset[N*items+M+1];

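If the number of records and fields is known before the data is read, the array of byte offsets can be a std::vector. If it is not known and must be inferred from the data, it should be a std::deque. This minimizes the cost of growing the container while the offsets are appended.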
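
The idea described above could be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the class name OffsetTable is hypothetical, every row is assumed to have the same number of fields, and every row (including the last) is assumed to end with '\n'. Unlike the two-line snippet above, it stores the offset at which each field starts rather than the separator positions, which avoids special handling for the very first field. It needs C++17 for std::string_view.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical sketch: keeps the whole file text plus the start offset of each field.
class OffsetTable {
public:
    OffsetTable(std::string text, std::size_t fieldsPerRow)
        : text_(std::move(text)), fieldsPerRow_(fieldsPerRow) {
        starts_.push_back(0);  // the first field starts at byte 0
        for (std::size_t i = 0; i < text_.size(); ++i) {
            if (text_[i] == '|' || text_[i] == '\n') {
                starts_.push_back(i + 1);  // the next field starts after the separator
            }
        }
    }

    // Returns field `col` of row `row` (both 0-based) as a view into the text.
    // The view stays valid only while this object is alive and unmodified.
    std::string_view field(std::size_t row, std::size_t col) const {
        const std::size_t k = row * fieldsPerRow_ + col;
        const std::size_t begin = starts_.at(k);
        const std::size_t end = starts_.at(k + 1) - 1;  // exclude the separator
        return std::string_view(text_).substr(begin, end - begin);
    }

private:
    std::string text_;
    std::size_t fieldsPerRow_;
    std::vector<std::size_t> starts_;
};
```

For example, after `OffsetTable t("a|bb|ccc\nd|ee|fff\n", 3);` the call `t.field(1, 1)` yields the view "ee" without copying any string data.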
Complete query code and class declarations from the question:

int n, numTuples;
unsigned long long int y;
clock_t t1, t2;

cout<< "Write the ID of the row you want to get more information: " ;
cin>>n; // We get the row to be represented -> "n"

cout<< "Write the number of followed tuples to be queried: " ;
cin>>numTuples; // We get the number of followed tuples to be queried-> "numTuples"

cout<<"Write the ID of the 'columns' you want to get more information: ";
cin>>y; // We get the "columns" to be represented -> "y"

unsigned int r; // Auxiliary variable for the columns path
int t=0; // Auxiliary variable for the tuples path
int idTable;

vector<int> columnsToBeQueried; // Here we store the columns to be queried, obtained from the bitset<500> binaryNumber after comparing it with a mask
vector<string> shownVector; // Vector to store the final information from the query
bitset<500> mask;
mask=0x1;

t1=clock(); // Start of the query time

bitset<500> binaryNumber = Utilities().getDecToBin(y); // We get the columns -> change number from decimal to binary. Max number of columns: 5000

// We see which columns will be queried
for(r=0;r<binaryNumber.size();r++) //
{               
    if(binaryNumber.test(r) & mask.test(r))  // if both of them are bit "1"
    {
        columnsToBeQueried.push_back(r);
    }
    mask=mask<<1;   
}

do
{
    for(int z=0;z<columnsToBeQueried.size();z++)
    {
        int i;
        i=columnsToBeQueried.at(z);

        vector<int> colTab;
        colTab.push_back(1); // Don't really worry about this

        //idTable = colTab.at(i);   // We identify in which table (with the id) is column_i
        // In this simple example we only have one table, so don't worry about this

        const Table& selectedTable = database.getPointer().at(0); // It simulates a vector with pointers to the different tables that compose the database, but our example database only has one table, so don't worry

        ElementSet selectedElementSet;

        selectedElementSet=selectedTable.getRowsCols().at(n);
        shownVector.push_back(selectedElementSet.getElements().at(i)); // We save in the vector shownVector the element "i" of the row "n"

    }   
    n=n+1;
    t++;            

}while(t<numTuples);

t2=clock(); // End of the query time

float diff ((float)t2-(float)t1);
float microseconds = diff / CLOCKS_PER_SEC*1000000;

cout<<"The query time is: "<<microseconds<<" microseconds."<<endl;

class ElementSet
{
private:
    int id;
    vector<string> elements; 

public:
    ElementSet(); 
    ElementSet(int, vector<string>); 

    const int& getId() const;
    void setId(int);

    const vector<string>& getElements() const; // const-qualified so it can be called on a const ElementSet
    void setElements(vector<string>);

};

class Table
{
private:
    vector<ElementSet> RowsCols; 

public:
    Table(); 
    Table(vector<ElementSet>); 

    const vector<ElementSet>& getRowsCols() const; // const-qualified: the query code calls it through a const Table&
    void setRowsCols(vector<ElementSet>);
};


class DataBase
{
private:
    int id;
    vector<Table> pointer;

public:
    DataBase();
    DataBase(int, vector<Table>);

    const int& getId() const;
    void setId(int);

    const vector<Table>& getPointer() const; // const-qualified so it can be called on a const DataBase
    void setPointer(vector<Table>);
};

class Utilities
{
public:
    Utilities();
    static bitset<500> getDecToBin(unsigned long long int);
};
